Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Migrate Map Functions #13047

Merged
merged 6 commits into from
Oct 24, 2024
Merged

feat: Migrate Map Functions #13047

merged 6 commits into from
Oct 24, 2024

Conversation

jonathanc-n
Copy link
Contributor

@jonathanc-n jonathanc-n commented Oct 21, 2024

Which issue does this PR close?

Closes #13021 .

Rationale for this change

What changes are included in this PR?

Migrate Map Functions

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added documentation Improvements or additions to documentation development-process Related to development process of DataFusion logical-expr Logical plan and expressions core Core DataFusion crate labels Oct 21, 2024
@jonathanc-n
Copy link
Contributor Author

This is tested with the map function, @Omega359 I am open to any changes you want to make here.


//! [`SpecialUDF`]: Special User Defined Functions

pub mod special_doc_sections {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like map is still a scalar function -- I was thinking that the DocSection was what was special (as in it would be DOC_SECTION_SPECIAL that was part of scalar_doc_sections 🤔

Copy link
Contributor Author

@jonathanc-n jonathanc-n Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb I was also thinking about that, so would the unnest function be put into it's own 'special function' page? And the map function can just be its own section under 'scalar_doc_sections'? You mentioned a special function page just for unnest in #12948

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think unnest truly is special (as it changes the schema of the output) and has its own Expr variant: https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.Expr.html#variant.Unnest

Maybe we can just have static docs for that one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is indeed just one or two functions we could just have them in the update script or in a file that is pulled into the final .md file by that script?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that sounds fine, it seems that unnest is one of the very few special functions. But i feel that it would be easier for them to just be static.

@jonathanc-n jonathanc-n deleted the add-special-page branch October 21, 2024 21:14
@jonathanc-n jonathanc-n restored the add-special-page branch October 21, 2024 21:14
@jonathanc-n jonathanc-n reopened this Oct 21, 2024
@github-actions github-actions bot removed the core Core DataFusion crate label Oct 21, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the back and forth @jonathanc-n

fn get_map_doc() -> &'static Documentation {
DOCUMENTATION.get_or_init(|| {
Documentation::builder()
.with_doc_section(DOC_SECTION_SPECIAL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I am missing something -- it seems like the map function should be in a "Map Functions" section to follow the existing documentation:

https://datafusion.apache.org/user-guide/sql/scalar_functions.html#map-functions

Copy link
Contributor Author

@jonathanc-n jonathanc-n Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like map is still a scalar function -- I was thinking that the DocSection was what was special (as in it would be DOC_SECTION_SPECIAL that was part of scalar_doc_sections 🤔

I think you mentioned above that you wanted the function to put in a doc_section_special here. So should I make the map function go under 'doc section map' instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say so, yes. My original thought was that unnest and make_map would be the two special functions but digging through the planner code it seems that make_map just calls into map udf (#11434) so I think the only special one is unnest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, so the direction of this pr now is to just add a static page with unnest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion -- I think there are two distinct action items:

  1. Migrate the documentation for map and related functions to a "Map Functions" section
  2. Create a new static page for "special functions" (different than Scalar Functions) where we can put unnest

Perhaps we can turn this PR into the map and related functions in a "Map Functions" section and make another PR to move the unnest docs to the new page.

I am happy to help with either

Thanks again for your help @jonathanc-n and @Omega359

@jonathanc-n jonathanc-n changed the title feat: Added Special Function Docs Page feat: Migrate Map Functions Oct 23, 2024
@github-actions github-actions bot removed the logical-expr Logical plan and expressions label Oct 23, 2024
@jonathanc-n
Copy link
Contributor Author

I migrated the map functions, however it should be noted that i had combined make_map and map since they were under the same udf. Is there a better way to do this?

@alamb
Copy link
Contributor

alamb commented Oct 23, 2024

I migrated the map functions, however it should be noted that i had combined make_map and map since they were under the same udf. Is there a better way to do this?

No -- I think this is the best way actually as they really are the same they should be listed as aliases

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jonathanc-n -- this looks great to me now

Thank you for bearing with us

@alamb alamb merged commit ac827ab into apache:main Oct 24, 2024
26 checks passed
@alamb
Copy link
Contributor

alamb commented Oct 24, 2024

🚀 lets keep things moving

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development-process Related to development process of DataFusion documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate all documentation for all map functions from scalar_functions.md to code
3 participants