-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for Excelx (xlsx) Exporter #64
base: master
Are you sure you want to change the base?
Conversation
# `Daru::DataFrame` instance variables | ||
class Excelx < Base | ||
Daru::DataFrame.register_io_module :to_excelx_string, self | ||
Daru::DataFrame.register_io_module :write_excelx, self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from standalone xlsx exporter, doesn't it make sense for the excel exporter to redirect here incase of .xlsx
filename in excel exporter's write method? Though, for such a redirect to be possible, they would need to have similar kind of arguments. That is, formatting has to be supported by xlsx exporter like excel exporter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% sure. Yep, it looks humane, but if/when you'll add formatting support for Excel exporters -- conventions on formatting are pretty different in two versions of Excel files and corresponding libraries.
In fact, I believe that in 2017 support for Excel 1999-2003 format (xls) should be done in "maintenance" mode, while xlsx exporter should be shiny and friendly.
string = df.to_excelx_string(index: false) | ||
df.write_excel('path/to/file.xlsx', index: false) | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am starting to think this part of README becames too long and repetitive. Just as an idea for the future:
- Leave 2-3 most "tasty" examples in REDME + link to...
- Formats.md with regular structure...
- Which is built automatically, gathering the comments from top of all exporters/importers files, with similar format.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For generating files like {FORMAT}_Importer.md
, does an ERB template work? Or is there a better option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's think. In my head, it (probably) does the following:
- takes all the importers (exporters) code;
- extract, say, top-of-the-file comment with explanations;
- join them together with pretty headers;
- store to
Formats.md
(one, not document-per-formatter, it is tiresome to browse). -
- probably generates TOC on the top of the file;
-
- probably adds some "preface" at the top of the file (taken from some
_preface.md
).
- probably adds some "preface" at the top of the file (taken from some
Can be implemented with just Ruby code, or small ERB template, which is the simplest.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -80,6 +80,7 @@ def write(path) | |||
contents.each { |content| csv << content } | |||
csv.close | |||
end | |||
true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that this could act as a consistent means of checking whether the "writing into a file" actually took place. Else, the File.write(...)
just returns the number of characters written - which may not mean much to the user/developer compared to a consistent (constant) true
.
It's not really necessary, but I thought this would be better. Let me know if this can be reverted.
# `Daru::DataFrame` instance variables | ||
class Excelx < Base | ||
Daru::DataFrame.register_io_module :to_excelx_string, self | ||
Daru::DataFrame.register_io_module :write_excelx, self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% sure. Yep, it looks humane, but if/when you'll add formatting support for Excel exporters -- conventions on formatting are pretty different in two versions of Excel files and corresponding libraries.
In fact, I believe that in 2017 support for Excel 1999-2003 format (xls) should be done in "maintenance" mode, while xlsx exporter should be shiny and friendly.
super(file_extension: '.xlsx') | ||
end | ||
|
||
# Exports an Excelx Exporter instance to an xlsx file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it doesn't export the exporter, it "exports the dataframe", or "performs the export(er)".
(If it is standard phrase for all exporters, this notice is related to all of them).
# @param path [String] Path of excelx file where the dataframe is to be saved | ||
# | ||
# @example Writing an Excelx Exporter instance to an xlsx file | ||
# instance.write('filename.xlsx') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that such a simple example brings any clarity. Just remove it?
def process_offsets | ||
@row_offset = @header ? 1 : 0 | ||
@col_offset = 0 unless @index | ||
@col_offset ||= @dataframe.index.is_a?(Daru::MultiIndex) ? @dataframe.index.width : 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this or similar statement repeated in other places in codebase (calculating of index witdth)? Maybe move it to some utility module, or to base exporter?
@col_offset ||= @dataframe.index.is_a?(Daru::MultiIndex) ? @dataframe.index.width : 1 | ||
end | ||
|
||
def fetch_headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "fetch" a good name? "Fetch" implies some external source to be used, while this method in fact "formats" or "renders" its own instance variables.
case idx | ||
when Daru::Vector, Daru::MultiIndex, Array then idx.map(&:to_s) | ||
else [idx.to_s] | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't Array(idx).map(&:to_s)
work instead of this case
?..
@data = data | ||
@index = index | ||
@sheet = sheet | ||
@header = header |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, do we have this lot of boolean vars in other importers too? I believe that one hash @render_parts = {data: data, headers: headers, index: index}
could be more effective and will allow to generalize a lot. In addition, looking at var names you can wrongly conclude that, say, @index
contains dataframe's index, not "if we need to render index" boolean.
formatting(row, @data) | ||
end | ||
|
||
def formatting(idx, format) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the best name, as we'll add "real" Excel formatting once. Something like just to_a
?.. Like this:
# or "to_strings"?..
def to_a(object, render = true)
return [] unless render
Array(object).map(&:to_s)
end
BTW, have you checked that to_s
is good enough? For example, wouldn't numbers or dates be formatted as strings, instead of appropriate Excel column types?
Oops, I didn't notice that this Pull Request had already been reviewed. I'll soon push the changes for the next review. Meanwhile, I've also opened #71 for keeping track of, in a different PR. |
write
exporter methods, which yet again draws attention to handle Add tests for linkages between daru & daru-io calls #55 ASAP)sheet
and boolean values forindex
,header
anddata
(show them or not). And of course,filename
for write method. Fix Support for Excelx Exporter #63.write
methods returntrue
incase of no mishaps when the particular exporter is called.