Replies: 1 comment
-
DataFusion is never going to solve those problems directly, since its core value proposition is to provide a set of building blocks that make it easy (or at least easier) for someone to build a system like the one you are describing. You may want to take a look at the projects that use DataFusion to see if any of them do what you want: https://datafusion.apache.org/user-guide/introduction.html#known-users
-
My understanding is that DataFusion is primarily an extensible query engine for engineers looking to build database systems (Influx and so on) without reinventing the wheel.
Having said that, I can see it has a Rust-based DataFrame API and SQL context at a high enough abstraction layer that it's tempting to start building Data Engineering pipelines in pure Rust. 😄
An example of something I'd love to be able to do with DataFusion (I know some of this is already possible):
Ideally, the above should be achievable by a Data Engineer who doesn't have domain knowledge of database internals. I also appreciate that, because DataFusion uses Arrow for its memory format, some of this might be wishful thinking (or at least not straightforward to implement).
Is there a chance DataFusion will evolve in this direction, or will the focus remain on database systems? Is anyone else in the community using DataFusion in Rust for Data Engineering and, if so, what has your experience been?