docs: json datatype rfc #4515
Looks cool to me!
LGTM, @WenyXu PTAL
I came up with two scenarios of how JSON would be used by end users:
For the first case, we can suggest that users decompose the JSON object into table columns, either by themselves (predefined table schema) or by GreptimeDB (gRPC auto alter or through pipeline). From the DB side, it's then just a wide table with several atomic types and is indistinguishable from other tables. Advantages like a high compression ratio or the SQL interface are available out of the box.
The second case, on the contrary, requires fully dynamic handling of JSON data. We'd prefer to handle the "JSON part" on read (like decoding and/or field access) to minimize the impact on write performance. This choice is based on the hypothesis that data without a known schema is unlikely to be queried frequently.
For the implementation phase, what we need and don't need to do are:
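To make the first case in the comment above concrete, here is a minimal sketch of decomposing a nested JSON object into flat columns of atomic values. The function name `flatten` and the dotted column naming are assumptions for illustration only; this is not how GreptimeDB's gRPC auto alter or pipeline actually names or creates columns.

```python
import json


def flatten(obj, prefix=""):
    """Decompose a nested JSON object into flat column -> value pairs.

    Hypothetical helper: nested keys are joined with "." to form column names.
    Arrays are left untouched as values; a real schema would need its own
    rule for them.
    """
    columns = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so that every leaf ends up as its own column.
            columns.update(flatten(value, name))
        else:
            columns[name] = value
    return columns


# One incoming JSON document becomes one row of a wide table.
row = flatten(json.loads('{"host": "a", "cpu": {"user": 0.5, "sys": 0.1}}'))
```

Each distinct leaf path becomes a column, so documents with a stable shape map onto an ordinary wide table, which is what lets compression and the SQL interface apply unchanged.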
Thanks for all the suggestions! I've changed the RFC to a relatively simple JSONB plan, which is a first consensus on a general JSON data type. It's not a detailed plan. I think I can first make some progress in this direction and then come back and refine this RFC.
For those "util functions", we can use https://github.com/datafusion-contrib/datafusion-functions-json, which provides an out-of-the-box implementation, as well as the binary representation, JSONA.
This crate just provides functions for querying JSON strings.
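As a rough illustration of what "querying JSON strings" means here, the sketch below decodes a stored JSON string at read time and follows a path of keys and array indices. The helper name `get_path` is hypothetical and is not the API of the crate mentioned above; it only shows the general idea of handling the JSON part on read.

```python
import json


def get_path(raw, *path):
    """Decode a JSON string on read and follow a path of keys/indices.

    Hypothetical helper. Returns None when any step is missing, so a JSON
    null and a missing field are not distinguished in this sketch.
    """
    node = json.loads(raw)  # decoding cost is paid at query time, not on write
    for step in path:
        if isinstance(node, dict) and isinstance(step, str):
            node = node.get(step)
        elif (
            isinstance(node, list)
            and isinstance(step, int)
            and -len(node) <= step < len(node)
        ):
            node = node[step]
        else:
            return None
    return node


# The write path stores the string as-is; only readers pay for parsing.
stored = '{"user": {"name": "x", "tags": ["a", "b"]}}'
second_tag = get_path(stored, "user", "tags", 1)
```

Because every call re-parses the string, this trades read cost for an untouched write path, which matches the assumption that schemaless data is queried rarely.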
9b2a311 to e2d3605
I accidentally triggered some auto review requests by mistake QAQ, sorry for that.
c9b38e6 to 22c2012
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
#4230
What's changed and what's your intention?
RFC for the JSON datatype.