[QUESTION/ENH] Java / VelocyPack <> Apache Arrow / Graphistry #77
Comments
cc @jsteemann as you seem to be the main contact for this :) Helpful links:
@lmeyerov : Hi, thanks for getting in touch. Let me check who will be the contact on our side. It will not necessarily be me. Need to check it internally first. Will get back once I have more info!
Thanks @jsteemann! If it helps, we're ultimately interested in a few integration points:
-- converting arango query responses into arrow-typed record or arrow-typed node+edge property tables, e.g., https://github.com/graphistry/pygraphistry/blob/master/demos/demos_databases_apis/arango/arango_tutorial.ipynb except with types.
We'd love to help the Java-using arangodb team be successful now, and are gearing up for a public native arango connector in q1 :)
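As a rough illustration of the node+edge table idea above, here is a minimal Java sketch. It assumes the query result has already been decoded into plain `Map<String, Object>` documents (however the app does that), and it relies only on the ArangoDB convention that edge documents carry `_from` and `_to` attributes. The class name `GraphTables` and its `split` method are hypothetical, not part of any ArangoDB or Graphistry API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: separates a mixed result set into node rows and edge rows. */
final class GraphTables {
    final List<Map<String, Object>> nodes = new ArrayList<>();
    final List<Map<String, Object>> edges = new ArrayList<>();

    /**
     * Treats a document as an edge when it has both _from and _to set,
     * and as a node otherwise. Documents are kept as-is, not copied.
     */
    static GraphTables split(List<Map<String, Object>> docs) {
        GraphTables tables = new GraphTables();
        for (Map<String, Object> doc : docs) {
            boolean isEdge = doc.get("_from") != null && doc.get("_to") != null;
            (isEdge ? tables.edges : tables.nodes).add(doc);
        }
        return tables;
    }
}
```

Each of the two lists can then be turned into its own property table, with `_from`/`_to` serving as the source and destination columns on the edge side.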
@lmeyerov did you ever complete your native Graphistry<->ArangoDB connector?
Hi @grepler, we have arangodb<>graphistry users combining the two via pydata envs like jupyter notebooks & streamlit dashboards and via our respective JS APIs; I'm unsure about our REST API. No-code/low-code (so no python/js/...) is a longer story. We're starting to do more customer-funded projects around roadmap items, so def something we're watching out for. If relevant, happy to chat!
Thanks @lmeyerov, I'll keep experimenting - bi-directional exploration & tagging interaction with the graph model would be amazing, but I will see if I can get by with one-way visualization of our AQL graph for the time being. +1 for more ArangoDB tooling adoption! Will keep your offer in mind as we continue our testing.
Great, lmk. Likewise, on the visual side, feel free to shout in our community slack. RE: bidirectional, a relevant feature request we've heard is exposing custom action buttons in our UI, so when embedding, you can turn custom tag etc. calls into an action like tagging a node in the DB. (Related, we're actively working on in-tool "grouping", such as for selecting nodes and saving as a tagged group, and "visual search", where analysts can build up pattern searches without writing cypher/aql/etc.)
The Graphistry team is starting to get requests from ArangoDB users to help grow their Arango implementations + use cases, and we're wondering if there is any guidance for getting Arango to interop with the broader Apache / Python / etc. data community? Ideally, via parquet/orc (cold) or even better, apache arrow (in-memory / streaming / etc.)?
Most immediately, we're working with one team where the goal is Arango<>their Java app<>Graphistry.
V0: The no-thought solution is doing `Arango--[json/csv]-->java--[json/csv]-->graphistry`, but that means big transfers, losing existing type data, etc. On the plus side, when the customer does know the result column schema, they can send that as part of the graphistry ingest step.
V1: To do better, we're thinking `Arango---[velocypack]-->Java app--[manually constructed arrow or orc typed columnar format for node+edge property tables]-->Graphistry`. Though we're unsure what such a conversion looks like, e.g., any sample VelocyPack code, and especially wrt taking type/serialization wrangling pain away from Arango users by doing automated conversions.
V2: Longer term, we're thinking direct `Arango--[velocypack stream]-->graphistry REST API--[velocypack stream chunk to arrow conversion]-->graphistry internal`. Or better, `Arango--[apache arrow/parquet/orc]-->Graphistry`, if on the roadmap. In both cases, no type wrangling etc. for users.
Any pointers would be appreciated. As simplifying constraints, users can get a lot of mileage by limiting the initial scope to node/edge queries that return primitively typed columns (string/int/date/etc.). Long-term, for fancier nested types (json, ...), the Arrow etc. ecosystems do support an increasing variety.
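To make the V1 idea a bit more concrete, here is a minimal, stdlib-only Java sketch of the row-to-column step under the simplifying constraint above (primitively typed columns only). It assumes the documents have already been decoded from VelocyPack into `Map<String, Object>` rows; the decoding itself and the final hand-off to an Arrow or ORC writer are not shown. The names `ColumnPivot`, `ColType`, and `Column` are hypothetical, and the widening rules (int + float becomes float, any other mix becomes string) are one possible choice, not something Arango or Graphistry prescribes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: pivots row-oriented documents into typed columns. */
final class ColumnPivot {

    /** Column types limited to the primitive scope discussed above. */
    enum ColType { NULL, BOOL, INT64, FLOAT64, STRING }

    /** One column: a name, an inferred type, and one slot per row (null = missing). */
    static final class Column {
        final String name;
        ColType type = ColType.NULL;
        final List<Object> values = new ArrayList<>();
        Column(String name) { this.name = name; }
    }

    /** Maps a decoded Java value to a column type; unknown or nested values fall back to STRING. */
    static ColType typeOf(Object v) {
        if (v == null) return ColType.NULL;
        if (v instanceof Boolean) return ColType.BOOL;
        if (v instanceof Byte || v instanceof Short
                || v instanceof Integer || v instanceof Long) return ColType.INT64;
        if (v instanceof Number) return ColType.FLOAT64;
        return ColType.STRING;
    }

    /** Smallest type in this sketch that can hold both a and b. */
    static ColType widen(ColType a, ColType b) {
        if (a == b || b == ColType.NULL) return a;
        if (a == ColType.NULL) return b;
        boolean numeric = (a == ColType.INT64 || a == ColType.FLOAT64)
                && (b == ColType.INT64 || b == ColType.FLOAT64);
        return numeric ? ColType.FLOAT64 : ColType.STRING;
    }

    /**
     * Builds one column per attribute name, in first-seen order.
     * Values are stored as-is; coercing them to the widened type
     * is left to whatever writer consumes the columns.
     */
    static Map<String, Column> pivot(List<Map<String, Object>> rows) {
        Map<String, Column> cols = new LinkedHashMap<>();
        int rowIndex = 0;
        for (Map<String, Object> row : rows) {
            for (Map.Entry<String, Object> e : row.entrySet()) {
                Column c = cols.get(e.getKey());
                if (c == null) {
                    c = new Column(e.getKey());
                    // Back-fill nulls for rows seen before this column appeared.
                    for (int i = 0; i < rowIndex; i++) c.values.add(null);
                    cols.put(e.getKey(), c);
                }
                c.values.add(e.getValue());
                c.type = widen(c.type, typeOf(e.getValue()));
            }
            rowIndex++;
            // Pad columns that this row did not mention.
            for (Column c : cols.values()) {
                if (c.values.size() < rowIndex) c.values.add(null);
            }
        }
        return cols;
    }
}
```

The resulting `Column` objects carry enough information (name, type, nullable values) to populate typed vectors in a columnar library. When the customer already knows the schema, the inference step could be skipped and the declared types used instead.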
Thanks!