Bug fixes
- Fix #458, custom S3 import and export functions work again
- Fix #453, don't nudge the user to install all suggested packages
- Fix #451, don't nudge the user to report issues about the trust parameter
- Fix #447 - remove an ancient artefact of Vignette generation, h/t Tim Taylor for the help.
- Roll back the decision to add parquet in the import tier see #455 #315
- Fix lintr issues #434 (h/t @bisaloo Hugo Gruson)
- Drop support for R < 4.0.0 see #436
- Add support for parquet in the import tier using nanoparquet; see rio 1.0.1 below.
Bug fixes
- Fix #430 Add back support for .dat
Bug fixes
- Fix #425 for archive formats, the file extension of the input file is determined by the compressed file (like prior rio 1.1.0)
- CRAN release
- Add trust parameter to functions that are used to load various R environment formats (.R, .Rds, .Rdata, etc). This parameter is defaulted to TRUE today to ensure backwards compatibility. A deprecation notice warns this will default to FALSE in rio 2.0. We are informing users that these data types should only be loaded from trusted sources, which should be affirmatively attested to.
- Test and fix the compression mechanism: Gzip, Bzip2 are now working as expected.
Bug fixes
- Fix #412, prevent double usage of which for archive formats
- Fix #415, both import_list() and export_list() support tar archives.
- Fix #421, tar export is only supported by R >= 4.0.3.
- For missing files in import_list it gives more informative warnings fix #389
- Single-item list of data frames can be exported fix #385
- Move stringi to Suggests to reduce compilation time. Add an attribution to the internal data to list out all required packages #378
- Move readr to Imports for fwf. readr is a dependency of haven so it does not increase the number of dependencies. Remove the original read.fwf2 which doesn't guess widths. Keep the widths and col.names to maintain compatibility. #381
- Add (back) a pkgdown website: https://gesistsa.github.io/rio/
- Update all test cases #380
- POTENTIALLY BREAKING: Due to compiling time concerns, roll back the decision to move arrow to Imports. It is now Suggests. setclass = "arrow" works if arrow is installed. #315 #376
- Stop loading the entire namespace of a suggested package when it is available #296
- Unexport objects: .import, .export, is_file_text; remove documentation for arg_reconcile #321
- Update Examples to make them more realistic #327
- Add support for qs #275 h/t David Schoch
- Use arrow to import / export feather #340
- export_list can write multiple data frames to a single archive file (e.g. zip, tar) or a directory #346 h/t David Schoch
- get_info is added #350
- POTENTIALLY BREAKING: setclass parameter is now authoritative. Therefore: import("starwars.csv", data.table = TRUE, setclass = "tibble") will return a tibble (unlike previous versions where a data.table is returned). The default class is data frame. You can either explicitly use the setclass parameter; or set the option: options(rio.import.class = "data.table"). h/t David Schoch #336
- Parquet and feather are now formats supported out of the box; Possible to setclass to arrow / arrow_table; ArrowTabular class can be exported #315
- Add "extension", "labelled" vignettes
- Support readODS 2.1.0 features such as reading and writing Flat ODS; export Multiple data frames #358
- POTENTIALLY BREAKING: Use writexl instead of openxlsx. The option to read xlsx with openxlsx (i.e. import("starwars.xlsx", readxl = FALSE)) is no longer available; readxl is always TRUE. The ability to overwrite an existing sheet in an existing xlsx file is also removed. It is against the design principle of rio.
- POTENTIALLY BREAKING: The following options are deprecated: import(fread), import(readr = TRUE), import(haven), import(readxl) and export(fwrite). import will almost always use data.table, haven, readxl, and an internal function (for fwf) to import and export data. Currently, those options stay for backward compatibility but will be removed in v2.0.0. #343 h/t David Schoch
- POTENTIALLY BREAKING: ... is handled differently. Underlying functions using "Tidy" convention (e.g. readxl::read_xlsx()) can use "Base Convention" (See the new vignette: remap). Unused arguments passed to the underlying function as ... are silently ignored by default. A new option rio.ignoreunusedargs is added to control this behavior. #326
- Bug fixes
- ... is correctly passed for exporting ODS and feather #318
- POTENTIALLY BREAKING: JSON are exported in UTF-8 by default; solved encoding issues on Windows R < 4.2. This won't affect any modern R installation where UTF-8 is the default. #318
- POTENTIALLY BREAKING: YAML are exported using yaml::write_yaml(). But it can't pass the UTF-8 check on older systems. Disclaimer added. #318
- More checks for the file argument #301
- import_list works with single Excel/HTML/Zip online #294
- Correct XML/HTML escaping #303
- Create directory if it doesn't exist #347
- Declutter
- remove the obsolete data.table option #323
- write all documentation blocks in markdown #311
- remove all @importFrom #325 h/t David Schoch
- rearrange "Package Philosophy" as a Vignette #320
- Create a single source of truth about all import and export functions #313
- Clarify all concepts: now there is only format #351
- New authors
- David Schoch @schochastics
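The setclass change noted above (setclass is authoritative, with data frame as the default class) can be sketched as follows. This is an illustration only, assuming rio >= 1.0.0 with tibble and data.table installed; the temporary CSV file and the variable names are hypothetical.

```r
# Minimal sketch; assumes rio >= 1.0.0 plus tibble and data.table.
library(rio)

# Create a small CSV file to read back in.
path <- tempfile(fileext = ".csv")
export(iris, path)

# 1. Request a class explicitly through the setclass parameter.
x_tbl <- import(path, setclass = "tibble")

# 2. Or set the session-wide option, then restore the previous value.
old <- options(rio.import.class = "data.table")
x_dt <- import(path)
options(old)

# 3. With neither, the default class is a plain data frame.
x_df <- import(path)

# Clean up the temporary file.
unlink(path)
```

Restoring the option afterwards keeps the example from changing the class returned by later import() calls in the same session.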
- Maintenance release: new maintainer
- Mark .sas7bdat as deprecated
- Change the minimum R version to 3.6
- fixes for CRAN
- Various fixes to tests, examples, and documentation for CRAN.
- Temporarily disabled some tests that failed on Mac M1s.
- Documentation fixes for CRAN.
- Added support for "zsav" format. (#273)
- Modified tests per email request from CRAN.
- Added coerce_character argument (default FALSE) to factorize() to enable coercing character columns to factor. (#278)
- Fix handling of "label" and "labels" attributes when exporting using haven methods (SPSS, Stata, SAS). (#268, h/t Ruben Arslan)
- Fix (a different bug?) handling factors by haven::labelled() (#271, Alex Bokov)
- HTML import can now handle multiple tbody elements within a single table, a th element in a non-header row, and empty elements in either the header or data. (#260, #263, #264 Bill Denney)
- CSVY support is now provided by data.table::fread() and data.table::fwrite(), providing significant performance gains.
- Added an internal arg_reconcile() function to streamline the task of removing/renaming arguments for compatibility with various functions (#245, Alex Bokov)
- Added an export_list() function to write a list of data frames to multiple files using a vector of file names or a file pattern. (#207, h/t Bill Denney)
- Added an is_file_text() function to determine whether a file is in a plain-text format. Optionally narrower subsets of characters can be specified, e.g. ASCII. (#236 Alex Bokov)
- Added support for Apache Arrow (Parquet) files. (#214)
- Fix dropping of variable label in characterize() and factorize(). (#204, h/t David Armstrong)
- import_list() now returns a filename attribute for each data frame in the list (when importing from multiple files), in order to distinguish files with the same base name but different extensions (e.g., import_list(c("foo.csv", "foo.tsv"))). (#208, h/t Vimal Rawat)
- Import of DBF files now does not convert strings to factors. (#202, h/t @jllipatz)
- Implemented import() method for .dump R files. (#240)
- Additional pointers were added to indicate how to load .doc, .docx, and .pdf files (#210, h/t Bill Denney)
- Ensure that tests only run if the corresponding package is installed. (h/t Bill Denney)
- Escape ampersands for html and xml export (#234 Alex Bokov)
- Fix behavior of export() to plain text files when append = TRUE (#201, h/t Julián Urbano)
- import_list() now preserves names of Excel sheets, etc. when the 'which' argument is specified. (#162, h/t Danny Parsons)
- Modify message and errors when working with unrecognized file formats. (#195, h/t Trevor Davis)
- Add support for GraphPad Prism .pzfx files (#205, h/t Bill Denney)
- Adjust import()/export() for JSON file formats to allow non-data frame objects. Behavior modeled after RDS format. (#199 h/t Nathan Day)
- Fix "the condition has length > 1 and only the first element will be used" warning in gather_attributes(). (#196, h/t Ruben Arslan)
- Fix "the condition has length > 1 and only the first element will be used" warning in standardize_attributes().
- Modified some further code to produce compatibility with haven 2.0.0 release. (#188)
- Add some additional function suggestions for the ledger package. (#190, h/t Trevor Davis)
- Changes to gather_attrs() for haven 2.0.0 release. (#188)
- Fixed a bug that generated a superfluous warning in import().
- Some style guide changes to code.
- Allow import() of objects other than data frames from R-serialized (.rds and .rdata) files. Also, export of such objects to .rds files is supported, as previously intended. (#183, h/t Nicholas Jhirad)
- Added (suggests) support for import of EViews files using hexView::readEViews(). (#163, h/t Boris Demeshev)
- Add better package specification to install_formats() so that it reads from the Suggests field of the DESCRIPTION file.
- Edit header of README.Rmd (and thusly README.md) to stop complaining about a lack of title field.
- Fix typo in CONTRIBUTING.md (line said "three arguments", but only listed two).
- Fixed a bug in import() wherein matlab files were ignored unless format was specified, as well as a related bug that made importing appear to fail for matlab files. (#171)
- Fixed a bug in export() wherein format was ignored. (#99, h/t Sebastian Sauer)
- Fixed a bug in the importing of European-style semicolon-separated CSV files. Added a test to ensure correct behavior. (#159, h/t Kenneth Rose)
- Updated documentation to reflect recent changes to the xlsx export() method. (#156)
- Removed some csvy-related tests, which were failing on CRAN.
- Removed longstanding warnings from the tests of export() to fixed-width format.
- Export the get_ext() function. (#169)
- Fix a bug related to an xml2 bug (#168, h/t Jim Hester)
- import_list() gains improved file name handling. (#164, h/t Ruaridh Williamson)
- Removed the overwrite argument from export() method for xlsx files. Instead, existing workbooks are always overwritten unless which is specified, in which case only the specified sheet (if it exists) is overwritten. If the file exists but the which sheet does not, the data are added as a new sheet to the existing workbook. (#156)
- Import of files with the ambiguous .dat extension, which are typically text-delimited files, are now passed to data.table::fread() with a message. Export to the format remains unsupported. (#98, #155)
- Added support for export to SAS XPORT format (via haven::write_xpt()). (#157)
- Switched default import package for SAS XPORT format to haven::read_xpt() with a haven = FALSE toggle restoring the previous default behavior using foreign::read.xpt(). (#157)
- Fixed a bug in import() from compressed files wherein the which argument did not necessarily return the correct file if >=2 files in the compressed folder.
- Tweak handling of export() to xlsx workbooks when which is specified. (#156)
- Expanded test suite and increased test coverage, fixing a few tests that were failing on certain CRAN builds.
- New functions characterize() and factorize() provide methods for converting "labelled" variables (e.g., from Stata or SPSS) into character or factor variables using embedded metadata. This can also be useful for exporting a metadata-rich file format into a plain text file. (#153)
- Fixed a bug in writing to .zip and .tar archives related to absolute file paths.
- Fixed some small bugs in import_list() and added tests for behavior.
- Add .bib as known-unsupported format via bib2df::bib2df().
- Expanded test coverage.
- Fixed a bug in .import.rio_xlsx() when readxl = FALSE. (#152, h/t Danny Parsons)
- Added a new function spread_attrs() that reverses the gather_attrs() operation.
- Expanded test coverage.
- export() now sets variables with a "labels" attribute to haven's "labelled" class.
- CRAN Release.
- Restored import of openxlsx so that writing to xlsx is supported on install. (#150)
- Improved documentation of mapping between file format support and the packages used for each format. (#151, h/t Patrick Kennedy)
- import_list() now returns a NULL entry for any failed imports, with a warning. (#149)
- import_list() gains additional arguments rbind_fill and rbind_label to control rbind-ing behavior. (#149)
- Import from and export to the clipboard now relies on clipr::read_clip() and clipr::write_clip(), respectively, thus (finally) providing Linux support. (#105, h/t Matthew Lincoln)
- Added an rbind argument to import_list(). (#149)
- Added a setclass argument to import_list(), ala the same in import().
- Switched requireNamespace() calls to quietly = TRUE.
- Further fixes to .csv.gz import/export. (#146, h/t Trevor Davis)
- Remove unnecessary urltools dependency.
- New function import_list() returns a list of data frames from a multi-object Excel Workbook, .Rdata file, zip directory, or HTML file. (#126, #129)
- export() can now write a list of data frames to an Excel (.xlsx) workbook. (#142, h/t Jeremy Johnson)
- export() can now write a list of data frames to an HTML (.html) file.
- Verbosity of export(format = "fwf") now depends on options("verbose").
- Fixed various errors, warnings, and messages in fixed-width format tests.
- Modified defaults and argument handling in internal function read_delim().
- Fixed handling of "data.table", "tibble", and "data.frame" classes in set_class(). (#144)
- Moved all non-critical format packages to Suggests, rather than Imports. (#143)
- Added support for Matlab formats. (#78, #98)
- Added support for fst format. (#138)
- Rearranged README.
- Bumped readxl dependency to >= 0.1.1 (#130, h/t Yongfa Chen)
- Pass explicit excel_format arguments when using readxl functions. (#130)
- Google Spreadsheets can now be imported using any of the allowed formats (CSV, TSV, XLSX, ODS).
- Added support for writing to ODS files via readODS::write_ods(). (#96)
- Handle HTML tables with <tbody> elements. (h/t Mohamed Elgoussi)
- Fixed a bug in .import.rio_xls() and .import.rio_xlsx() where the sheet argument would return an error.
- Fixed a bug in the import of delimited files when fread = FALSE. (#133, h/t Christopher Gandrud)
- With new data.table release, export using fwrite() is now the default for text-based file formats.
- Fixed a bug in .import.rio_xls() wherein the which argument was ignored. (h/t Mohamed Elgoussi)
- Added support for importing from multi-table HTML files using the which argument. (#126)
- Improved behavior of import() and export() with respect to unrecognized file types. (#124, #125, h/t Jason Becker)
- Added explicit tests of the S3 extension mechanism for .import() and .export().
- Attempt to recognize compressed but non-archived file formats (e.g., ".csv.gz"). (#123, h/t trevorld)
- Update import and export methods to use new xml2 for XML and HTML export. (#86)
- Fix failing tests related to stricter variable name handling for Stata files in development version of haven. (#113, h/t Hadley Wickham)
- Added support for export of .sas7bdat files via haven (#116)
- Restored support for import from SPSS portable via haven (#116)
- Updated import methods to reflect changed formal argument names in haven. (#116)
- Converted to roxygen2 documentation and made NEWS an explicit markdown file.
- rio sets options(datatable.fread.dec.experiment=FALSE) during onLoad to address a Unix-specific locale issue.
- Note unsupported NumPy i/o via RcppCNPy. (#112)
- Fix import of European-style CSV files (sep = "," and sep2 = ";"). (#106, #107, h/t Stani Stadlmann)
- Changed feather Imports to Suggests to make rio installable on older R versions. (#104)
- Noted new RStudio add-in, GREA, that uses rio. (#109)
- Migrated CSVY-related code to separate package (https://github.com/leeper/csvy/). (#111)
- Removed unnecessary error in xlsx imports. (#103, h/t Kevin Wright)
- Fixed a bug in the handling of "labelled" class variables imported from haven. (#102, h/t Pierre LaFortune)
- Improved use of the sep argument for import of delimited files. (#99, h/t Danny Parsons)
- Removed support for import of SPSS Portable (.por) files, given deprecation from haven. (#100)
- Fixed other tests to remove (unimportant) warnings.
- Fixed a failing test of file compression that was found in v0.4.3 on some platforms.
- Improved, generalized, tested, and expanded documentation of which argument in import().
- Expanded test suite and made some small fixes.
- Added support to import and export to feather data serialization format. (#88, h/t Jason Becker)
- Fixed behavior of gather_attrs() on a data.frame with no attributes to gather. (#94)
- Removed unrecognized file format error for import from compressed files. (#93)
- CRAN Release.
- Added a gather_attrs() function that moves variable-level attributes to the data.frame level. (#80)
- Added preliminary support for import from HTML tables (#86)
- Added support for export to HTML tables. (#86)
- Fixed a bug in import from remote URLs with incorrect file extensions.
- Added support for import from fixed-width format files via readr::read_fwf() with a specified widths argument. This may enable faster import of these types of files and provides a base-like interface for working with readr. (#48)
- Added support for import from and export to yaml. (#83)
- Fixed a bug when reading from an uncommented CSVY yaml header that contained single-line comments. (#84, h/t Tom Aldenberg)
- Diagnostic messages were cleaned up to facilitate translation. (#57)
- .import() and .export() are now exported S3 generics and documentation has been added to describe how to write rio extensions for new file types. An example of this functionality is shown in the new suggested "rio.db" package.
- import() now uses xml2 to read XML structures and export() uses a custom method for writing to XML, thereby negating dependency on the XML package. (#67)
- Enhancements were made to import and export of CSVY to store attribute metadata as variable-level attributes (like imports from binary file formats).
- import() gains a which argument that is used to select which file to return from within a compressed tar or zip archive.
- Export to tar now tries to correct for bugs in tar() that are being fixed in base R via PR#16716.
- Fixed a bug in import() (introduced in #62, 7a7480e5) that prevented import from clipboard. (h/t Kevin Wright)
- export() returns a character string. (#82)
- The use of import() for SAS, Stata, and SPSS files has been streamlined. Regardless of whether the haven = TRUE argument is used, the data.frame returned by import() should now be (nearly) identical, with all attributes stored at the variable rather than data.frame level. This is a non-backwards compatible change. (#80)
- Fixed error in export to CSVY with a commented yaml header. (#81, h/t Andrew MacDonald)
- export() now allows automatic file compression as tar, gzip, or zip using the file argument (e.g., export(iris, "iris.csv.zip")).
- Expanded verbosity of export() for fixed-width format files and added a commented header containing column class and width information.
- Exporting factors to fixed-width format now saves those values as integer rather than numeric.
- Expanded test suite and separated tests into format-specific files. (#51)
- Export of CSVY files now includes commenting the yaml header by default. Import of CSVY accommodates this automatically. (#74)
- Export of CSVY files and metadata now supported by export(). (#73)
- Import of CSVY files now stores dataset-level metadata in attributes of the output data.frame. (#73, h/t Tom Aldenberg)
- When rio receives an unrecognized file format, it now issues a message. The new internal .import.default() and .export.default() then produce an error. This enables add-on packages to support additional formats through new S3 methods of the form .import.rio_EXTENSION() and .export.rio_EXTENSION().
- Use S3 dispatch internally to call new (unexported) .import() and .export() methods. (#42, h/t Jason Becker)
- Release to CRAN.
- Set a default numerical precision (of 2 decimal places) for export to fixed-width format.
- Import stats package for na.omit().
- Added support for direct import from Google Sheets. (#60, #63, h/t Chung-hong Chan)
- Refactored remote file retrieval into separate (non-exported) function used by import(). (#62)
- Added test suite to test file conversion.
- Expanded test suite to include test of all export formats.
- Cleaned up NAMESPACE file.
- If file format for a remote file cannot be identified from the supplied URL or the final URL reported by curl::curl_fetch_memory(), the HTTP headers are checked for a filename in the Content-Disposition header. (#36)
- Removed longurl dependency. This is no longer needed because we can identify formats using curl's url argument.
- Fixed a bug related to importing European-style ("csv2") format files. (#44)
- Updated CSVY import to embed variable-level metadata. (#52)
- Use urltools::url_parse() to extract file extensions from complex URLs (e.g., those with query arguments). (#56)
- Fixed NAMESPACE notes for base packages. (#58)
- Modified behavior so that files imported using haven now store variable metadata at the data.frame level by default (unlike the default behavior in haven, which can cause problems). (#37, h/t Ista Zahn)
- Added support for importing CSVY (http://csvy.org/) formatted files. (#52)
- Added import dependency on data.table 1.9.5. (#39)
- Uses the longurl package to expand shortened URLs so that their file type can be easily determined.
- Improved support for importing from compressed directories, especially web-based compressed directories. (#38)
- Add import dependency on curl >= 0.6 to facilitate content type parsing and format inference from URL redirects. (#36)
- Add bit64 to Suggests to remove an import warning.
- import always returns a data.frame, unless setclass is specified. (#22)
- Added support for import from legacy Excel (.xls) files via readxl::read_excel, making its use optional. (#19)
- Added support for import from and export to the system clipboard on Windows and Mac OS.
- Added support for export to simple XML documents. (#12)
- Added support for import from simple XML documents via XML::xmlToDataFrame. (#12)
- Added support for import from ODS spreadsheet formats. (#12, h/t Chung-hong Chan)
- Use data.table::fread by default for reading delimited files. (#3)
- Added support for import and export of dput and dget objects. (#10)
- Added support for reading from compressed archives (.zip and .tar). (#7)
- Added support for writing to fixed-width format. (#8)
- Set stringsAsFactors = FALSE as default for reading tabular data. (#4)
- Added support for HTTPS imports. (#1, h/t Christopher Gandrud)
- Added support for automatic file naming in export based on object name and file format. (#5)
- Exposed convert function.
- Added vignette, knitr-generated README.md, and updated documentation. (#2)
- Added some non-exported functions to simplify argument passing and streamline package API. (#6)
- Separated import, export, convert, and utilities into separate source code files.
- Expanded the set of supported file types/extensions, switched SPSS, SAS, and Stata formats to haven, making its use optional.
- Updated documentation and fixed a bug in csv import without header.
- Initial release