On the parquet question, I've looked through the code and see that in Awkward's use, all calls go through […]. The exception is […]. In short, we need to percolate a filesystem through the Awkward code and replace […]. Note that the Dask implementation essentially does all the filesystem and path handling and builds a graph before calling […].
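To make "percolate a filesystem through" concrete, here is a minimal sketch of the pattern, not Awkward's actual code. It assumes a filesystem object is accepted at the top-level entry point and handed down to whatever finally opens the file, instead of the lower layers calling the builtin `open()`. Every name in it (`FileSystem`, `LocalFileSystem`, `MemoryFileSystem`, `read_magic`, `is_parquet`) is hypothetical; fsspec filesystems expose a comparable `open()` method, but the sketch uses only the standard library.

```python
from __future__ import annotations

import io
from typing import BinaryIO, Protocol


class FileSystem(Protocol):
    """Hypothetical minimal interface: anything with a compatible open()."""

    def open(self, path: str, mode: str = "rb") -> BinaryIO: ...


class LocalFileSystem:
    """Default choice: plain local files via the builtin open()."""

    def open(self, path: str, mode: str = "rb") -> BinaryIO:
        return open(path, mode)


class MemoryFileSystem:
    """In-memory stand-in for a remote store (illustration only)."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self.files = files

    def open(self, path: str, mode: str = "rb") -> BinaryIO:
        return io.BytesIO(self.files[path])


def read_magic(path: str, fs: FileSystem) -> bytes:
    # Lowest layer: uses the filesystem it was given, never builtin open().
    with fs.open(path, "rb") as handle:
        return handle.read(4)


def is_parquet(path: str, fs: FileSystem | None = None) -> bool:
    # Top layer: picks a default once, then hands `fs` down the call chain.
    fs = fs if fs is not None else LocalFileSystem()
    # Parquet files start with the 4-byte magic number b"PAR1".
    return read_magic(path, fs) == b"PAR1"


remote = MemoryFileSystem(
    {
        "bucket/events.parquet": b"PAR1" + b"\x00" * 16,
        "bucket/notes.txt": b"hello world",
    }
)
found = is_parquet("bucket/events.parquet", remote)
not_found = is_parquet("bucket/notes.txt", remote)
```

The design point is that only the outermost function decides which filesystem to use; everything below it stays agnostic, so local paths and remote stores go through the same code.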
[…] `ak.Array`; projects like `dask.array` and `dask.frame` are distinct from NumPy and Pandas, but the distinction is not maintained within xarray. @martindurant is of the opinion that it's better for them to be separate, so that calling `.compute()` or `.persist()` is a big deal, very visible to the user. I can be persuaded; I just want to think about it and be sure that we're not giving up an opportunity. This project is different from `dask.array` and `dask.frame` in that we can change Awkward's v2 development to fit, to make a more unified experience for users.

Whether `ak.from_parquet` and `dak.from_parquet` (`import dask_awkward as dak`) should be separate functions (as they were in @douglasdavis’s demo, with each returning a different Python type: `ak.Array` vs `dak.AwkwardDaskArray`), or whether there should be one function with a `lazy=True` flag. This is related to @agoose77's question on Gitter about `ak.broadcast_arrays`'s arguments vs multiple functions, but it addresses a more fundamental split if `ak.Array` is not the same Python type as `dak.AwkwardDaskArray` (even by inheritance).

If `.compute()` or `.persist()` is a big deal, very user-visible, then we might be able to dispense with caching. Identical nodes within a DAG are not recomputed within a single `.compute()`, as part of Dask's infrastructure, but if a user calls `.compute()` twice, the file will be read twice and the statements will be computed twice. Users can explicitly assign the result to a variable. (Caching was important when computation was invoked implicitly.)

Questions about replacing Awkward's internal VirtualArray and PartitionedArray with Dask and an explicit, user-level `.compute()` would affect Coffea, especially NanoEvents, so @nsmith- will want to know that we're discussing it.