Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add fields_metadata to SegmentReader, add columnar docs #2222

Merged
merged 8 commits into from
Nov 22, 2023
Merged

Conversation

PSeitz
Copy link
Contributor

@PSeitz PSeitz commented Oct 19, 2023

No description provided.

@PSeitz PSeitz requested a review from fulmicoton October 19, 2023 10:49
@PSeitz PSeitz force-pushed the list_fields branch 2 times, most recently from 6d9d17a to 35da586 Compare November 15, 2023 07:11
@PSeitz PSeitz requested a review from fulmicoton November 16, 2023 02:40
let mut out = Vec::new();
for (_key, mut group) in &input
.into_iter()
.kmerge_by(|a, b| a < b)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left / right

Suggested change
.kmerge_by(|a, b| a < b)
.kmerge_by(|a, b| a < b)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you. remove the Ord implementation of FieldMetadata, create a function extracting the key you want (returning the pair field name field type ).
Put that in that file, and call it here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have the Ord implementation, so it will behave consistent with sort_by etc.

impl Ord for FieldMetadata {
fn cmp(&self, other: &Self) -> Ordering {
match self.field_name.cmp(&other.field_name) {
Ordering::Equal => self.typ.cmp(&other.typ),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid inconsistent Ord / PartialOrd / Eq/ PartialEq / Hash implementation. This is a very common footgun.
(Here your Ord impl is inconsistent with Eq)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll switch to a derive with a comment to not reorder

Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main comment is to remove the PartialOrd / Ord, but please have a look at the other comments too.

let mut out = Vec::new();
for (_key, mut group) in &input
.into_iter()
.kmerge_by(|a, b| a < b)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you. remove the Ord implementation of FieldMetadata, create a function extracting the key you want (returning the pair field name field type ).
Put that in that file, and call it here instead?

src/core/segment_reader.rs Outdated Show resolved Hide resolved
src/core/segment_reader.rs Show resolved Hide resolved
src/core/segment_reader.rs Outdated Show resolved Hide resolved
src/core/segment_reader.rs Outdated Show resolved Hide resolved
schema: &Schema,
) -> Vec<FieldMetadata> {
// Maybe too slow for the high cardinality case
fn is_field_stored(field_name: &str, schema: &Schema) -> bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Can we have the function outside of this?

I personally find it less readable.
I assume your point is that you want to namespace the function as much as possible.
One trouble with these is that it makes it non trivial to know whether they capture their environment or not.

So answering the question, "what does thjis function does" actually needs to look at its impl, as opposed to looking at the prototype of the function. It could hide any surprises.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think fn can't capture anything?
I moved it outside

@PSeitz PSeitz merged commit 1a9fc10 into main Nov 22, 2023
4 checks passed
@PSeitz PSeitz deleted the list_fields branch November 22, 2023 11:29
PSeitz added a commit that referenced this pull request Apr 10, 2024
* add fields_metadata to SegmentReader, add columnar docs

* use schema to resolve field, add test

* normalize paths

* merge for FieldsMetadata, add fields_metadata on Index

* Update src/core/segment_reader.rs

Co-authored-by: Paul Masurel <[email protected]>

* merge code paths

* add Hash

* move function oustide

---------

Co-authored-by: Paul Masurel <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants