Add ability to read .dataset formats (deserialization) #6

andrie · 2015-08-03T16:52:32Z

The .dataset format is used as the output of most modules in ML Studio (intermediate datasets). For example, the Split module results are in that format.

Studio currently disables the Generate Data Access Code and Open in Notebook features on those output nodes due to lack of deserialization support for that format in Python.

To access those intermediate datasets from Python code, the user needs to insert a Convert to CSV module. Note that this conversion loses some metadata, such as column type information. Pandas can infer the types most of the time, but sometimes it requires user post-processing.
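As a rough illustration of that post-processing step, here is a minimal sketch, assuming the Convert to CSV output has already been downloaded as text. The column names, sample rows, and the `read_converted_csv` helper are hypothetical and not part of any Studio API; the only library call used is `pandas.read_csv` with its `dtype` and `parse_dates` parameters.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the output of a Convert to CSV module.
csv_text = (
    "id,zip,signup,score\n"
    "1,02139,2015-08-03,0.5\n"
    "2,98052,2015-08-04,\n"
)

# Hypothetical column types. The CSV no longer carries them, so the user
# has to supply them by hand.
dtypes = {"id": "int64", "zip": "str", "score": "float64"}


def read_converted_csv(source, dtypes, date_columns):
    """Read a CSV and apply explicit column types instead of relying on inference."""
    return pd.read_csv(source, dtype=dtypes, parse_dates=date_columns)


df = read_converted_csv(io.StringIO(csv_text), dtypes, ["signup"])
print(df.dtypes)
```

Without the explicit `dtype` mapping, pandas would likely read `zip` as an integer and drop the leading zero, which is the kind of type information the CSV conversion loses.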

piccolbo · 2015-08-03T21:55:33Z

OK, but this is not in the Python version yet, so I think we could focus on Python parity first: we are behind on many things, and having a reference implementation in place is a big help. So I am saying absolutely, but at slightly lower priority.

piccolbo · 2015-08-04T16:58:48Z

I didn't see a spec of the format in the material you attached to your email message. If you run into something more detailed can you post it here? Thanks

bwlewis · 2015-10-22T06:46:33Z

The .dataset format requires .NET. A description of Dataset can be found here:
https://msdn.microsoft.com/en-us/library/azure/dn905850.aspx

The underlying structure is referred to as a "Data Table", which is an object of the .NET class "Array":

https://msdn.microsoft.com/library/system.array.aspx

On non-Windows platforms, Mono would be required to read this. It's a pain and probably not worth dealing with right now.
