Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use bulk actions #28

Merged
merged 4 commits into from
Feb 13, 2024
Merged

Use bulk actions #28

merged 4 commits into from
Feb 13, 2024

Conversation

dustinblack
Copy link
Member

Changes introduced with this PR

Redesigning the main operational construct to use bulk actions instead of discrete indexing operations.


By contributing to this repository, I agree to the contribution guidelines.

@dustinblack dustinblack force-pushed the use-bulk-actions branch 2 times, most recently from 25d5dc5 to 70b5963 Compare December 20, 2023 17:47
Automatic upate of README.md by arcaflow-docsgen arcabot

fix http authentication
add tests for conversion functions

linting

pass through dicts in list conversion function
enable serialization tests

schema tweaks

implement generator for ingesting the bulk list

correct error capture output

cleanup

update integration tests

changing bulkuploadobject operation dict key temporarily to a string

linting

Automatic upate of README.md by arcaflow-docsgen arcabot

only append metadata if it exists

enable multi-arch build

add plugin step to pre-process a list of data into a bulk_upload_list

Automatic upate of README.md by arcaflow-docsgen arcabot

add step definition to docker-compose action for build
@dustinblack dustinblack marked this pull request as ready for review February 8, 2024 16:38
@dustinblack dustinblack requested a review from a team February 8, 2024 16:38
Copy link

@jaredoconnell jaredoconnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Untested, but other than one minor comment, looks good.

typing.Optional[typing.Dict[str, typing.Any]],
schema.name("metadata"),
schema.description(
"Optional global metadata object that will be added " "to every document."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the linter removed the code wrap.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a way to name the metadata first-level key instead of just hardcoding "metadata"? For example, {"@metadata" {...}}? You could do this easily either by merging your value here directly into the document with doc.update(params.metadata) or with a separate metadata_key parameter which could even default to "metadata" and doc[params.metadata_key] = params.metadata.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbutenhof I guess I'm not following. The schema.name is part of the self-documenting feature of the plugin. Often the name simply matches the key, but not always.

typing.Optional[typing.Dict[str, typing.Any]],
schema.name("metadata"),
schema.description(
"Optional global metadata object that will be added " "to every document."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a way to name the metadata first-level key instead of just hardcoding "metadata"? For example, {"@metadata" {...}}? You could do this easily either by merging your value here directly into the document with doc.update(params.metadata) or with a separate metadata_key parameter which could even default to "metadata" and doc[params.metadata_key] = params.metadata.

Copy link
Contributor

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I started this review yesterday and then "lost" it, so I'm posting what I have so far, and perhaps I'll get back to it on Monday (if it is still open, then).

Comment on lines +28 to +30
bulk_upload_list = []
for item in params.data_list:
bulk_upload_list.append(BulkUploadObject(params.operation, item))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively,

    bulk_upload_list = [BulkUploadObject(params.operation, i) for i in params.data_list]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me the longer way is a bit more readable. Is there an advantage here beyond tidiness?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two:

  • doing it all in one line (provided that the line does not wrap) means that more lines fit on a screen which makes it easier for the reader to comprehend the code (i.e., the reader should be able to move his/her eyes and see the whole functional unit without having to move the scroll bar); and,
  • the "list comprehension" is scoped and bounded, such that none of the variables escape the expression and none of the operations have side-effects (unless they are a call to a function whose purpose is to have side-effects...), which means that the reader can safely consider the comprehension to be a "black box" if that suits his/her purposes.

arcaflow_plugin_opensearch/opensearch_plugin.py Outdated Show resolved Hide resolved
arcaflow_plugin_opensearch/opensearch_plugin.py Outdated Show resolved Hide resolved
arcaflow_plugin_opensearch/opensearch_plugin.py Outdated Show resolved Hide resolved
Comment on lines +96 to +98
ids = []
for i in resp["items"]:
ids.append(list(i.values())[0]["_id"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternately,

        ids = [next(iter(i.values()))["_id"] for i in resp["items"]]

@dustinblack dustinblack merged commit 8500053 into main Feb 13, 2024
3 checks passed
@webbnh webbnh deleted the use-bulk-actions branch August 26, 2024 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants