Expected Behavior
The Spark program should work even if portions of the dataset are too large to fit in memory.
Actual Behavior
We suspect that ModelFunction may have scalability issues. We have provided ModelFunctionDataset, which resolves this but ran slower when we tested it. If issues do occur, replace ModelFunction with ModelFunctionDataset in the Driver program. We should also look into optimizing the dataset logic in ModelFunctionDataset.
Steps to Reproduce the Problem
1. Generate gigabytes of data.
2. Test the code as-is on an EMR cluster.
3. Determine whether an error occurs.
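For the data-generation step, a minimal sketch of a synthetic input generator might look like the following. The issue does not state the input schema, so the CSV layout of random numeric columns, the function name `generate_csv`, and its parameters are all assumptions made for illustration; adapt them to whatever format the Driver program actually reads.

```python
import os
import random
import tempfile


def generate_csv(path, target_bytes, num_features=4, seed=0):
    """Write random numeric rows to `path` until it holds at least `target_bytes` bytes.

    Returns the number of rows written. The column layout is hypothetical,
    not the schema used by the project.
    """
    rng = random.Random(seed)
    written = 0
    rows = 0
    # newline="\n" disables newline translation so the byte count is exact.
    with open(path, "w", newline="\n") as f:
        while written < target_bytes:
            line = ",".join(f"{rng.random():.6f}" for _ in range(num_features)) + "\n"
            f.write(line)
            written += len(line)  # ASCII only, so characters == bytes
            rows += 1
    return rows


if __name__ == "__main__":
    out = os.path.join(tempfile.gettempdir(), "synthetic_input.csv")
    # About 1 MB here; raise target_bytes to several gigabytes for a real test.
    n = generate_csv(out, target_bytes=1_000_000)
    print(n, os.path.getsize(out))
```

Writing rows in a loop keeps the generator's own memory use constant, so the target size can exceed the RAM of the machine producing the file.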