implemented dataframe.cov #2142
Codecov Report
@@            Coverage Diff             @@
##           master    #2142      +/-   ##
==========================================
- Coverage   95.37%   93.14%   -2.24%
==========================================
  Files          60       60
  Lines       13694    13601      -93
==========================================
- Hits        13060    12668     -392
- Misses        634      933     +299
]
kdf = self[num_cols]
names = [name for t in num_cols for name in t]
mat = kdf.to_pandas().to_numpy(dtype=float, copy=False)
I'm afraid using to_pandas() without any restriction is not a good idea. It will cause an OOM if the data size doesn't fit in the driver's memory.
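To illustrate the concern, here is a minimal sketch in plain Python of how a covariance matrix can be built from small, mergeable per-chunk summaries instead of materializing every row in one place. It is not Koalas or Spark code and not what this PR does; the names `partial_stats`, `merge_stats`, and `covariance` are hypothetical, and each "chunk" only stands in for a partition of the data.

```python
from functools import reduce


def partial_stats(rows):
    """Summarize one chunk as (row count, column means, co-moment matrix)."""
    n, mean, comoment = 0, [], []
    for row in rows:
        if n == 0:
            k = len(row)
            mean = [0.0] * k
            comoment = [[0.0] * k for _ in range(k)]
        n += 1
        # Deviation from the means before this row is included.
        delta = [x - m for x, m in zip(row, mean)]
        mean = [m + d / n for m, d in zip(mean, delta)]
        # Deviation from the means after this row is included.
        resid = [x - m for x, m in zip(row, mean)]
        for i in range(len(row)):
            for j in range(len(row)):
                comoment[i][j] += delta[i] * resid[j]
    return n, mean, comoment


def merge_stats(a, b):
    """Combine two chunk summaries without revisiting the rows."""
    n_a, mean_a, c_a = a
    n_b, mean_b, c_b = b
    if n_a == 0:
        return b
    if n_b == 0:
        return a
    n = n_a + n_b
    delta = [mb - ma for ma, mb in zip(mean_a, mean_b)]
    mean = [ma + d * n_b / n for ma, d in zip(mean_a, delta)]
    k = len(delta)
    comoment = [
        [c_a[i][j] + c_b[i][j] + delta[i] * delta[j] * n_a * n_b / n
         for j in range(k)]
        for i in range(k)
    ]
    return n, mean, comoment


def covariance(chunks, ddof=1):
    """Covariance matrix over all chunks; only the summaries are kept."""
    empty = (0, [], [])
    n, _, comoment = reduce(merge_stats, map(partial_stats, chunks), empty)
    if n - ddof <= 0:
        raise ValueError("not enough rows for the requested ddof")
    return [[v / (n - ddof) for v in row] for row in comoment]


# Two chunks standing in for two partitions of a two-column frame.
chunks = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
cov = covariance(chunks)
```

Each summary holds only a count, k means, and a k-by-k matrix, so the memory needed in one place depends on the number of numeric columns rather than the number of rows.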
Hi @LSturtew, since Koalas has been ported to Spark as pandas API on Spark, would you like to migrate this PR to the Spark repository? Here is the ticket: https://issues.apache.org/jira/browse/SPARK-36396. Otherwise, I may do that for you next week.
ref #1929
Implement DataFrame.cov