
[WIP] Add text tagging by prompt mapper op #408

Open

wants to merge 1 commit into base: main
Conversation

garyzhang99 (Collaborator) commented Aug 30, 2024

As the title says: use an LLM to perform arbitrary tagging and classification driven by a user-supplied prompt (a rough sketch of the idea follows below).

  • Also adding a filter op, analogous to video_tagging_from_frames_filter
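A minimal sketch of what such a prompt-based tagging mapper might look like, assuming data-juicer's usual Mapper/get_model pattern; the class name, constructor parameter, and prompt handling here are illustrative assumptions, not the PR's actual code:

```python
from data_juicer.ops.base_op import Mapper
from data_juicer.utils.constant import Fields
from data_juicer.utils.model_utils import get_model


class TextTaggingByPromptMapper(Mapper):
    """Illustrative sketch: tag each sample by querying an LLM
    with a user-supplied prompt template."""

    def __init__(self, prompt_template: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # e.g. 'Classify the following text: {text}'
        self.prompt_template = prompt_template

    def process(self, sample, rank=None):
        model, processor = get_model(self.model_key, rank=rank)
        prompt = self.prompt_template.format(text=sample[self.text_key])
        inputs = processor(prompt, return_tensors='pt').to(model.device)
        response = model.generate(**inputs, max_new_tokens=256)
        output = processor.decode(response[0], skip_special_tokens=True)
        sample[Fields.text_tags] = [output.strip()]
        return sample
```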

@garyzhang99 garyzhang99 added the dj:op issues/PRs about some specific OPs label Aug 30, 2024
@garyzhang99 garyzhang99 self-assigned this Aug 30, 2024
@garyzhang99 garyzhang99 requested a review from yxdyc August 30, 2024 03:39
skip_special_tokens=True)

text_tags = []
text_tags.append(output)
Collaborator commented:
It is better to strip the output.
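A minimal sketch of the suggested fix, assuming output is the decoded model string from the excerpt above:

```python
text_tags = []
# Strip surrounding whitespace/newlines from the decoded output
# before storing it as a tag.
text_tags.append(output.strip())
```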

""" # noqa

super().__init__(*args, **kwargs)
self.num_proc = 1
Collaborator commented:
If enable_vllm is False, num_proc=1 will still force this OP to run in a single process/GPU. Is this the desired behavior?
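One possible resolution (a sketch only, assuming the op exposes an enable_vllm flag as other data-juicer ops do):

```python
super().__init__(*args, **kwargs)
# Only pin to a single process when vLLM manages the GPUs itself;
# otherwise let the executor's num_proc setting apply.
if enable_vllm:
    self.num_proc = 1
```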


text_tags = []
text_tags.append(output)
sample[Fields.text_tags] = text_tags
Collaborator commented:
Please refer to #423 for adding a user-specified tag field name.
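A hypothetical sketch of a user-specified field name; the parameter name tag_field_name is illustrative, and #423 should be followed for the actual convention:

```python
def __init__(self, *args, tag_field_name=Fields.text_tags, **kwargs):
    super().__init__(*args, **kwargs)
    # Hypothetical: let users choose where tags are written,
    # instead of hard-coding Fields.text_tags.
    self.tag_field_name = tag_field_name
```

The assignment in process would then become sample[self.tag_field_name] = text_tags.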

if not string_list:
    assert False, "The input list must not be empty"

for string in string_list:
@drcege (Collaborator) commented Sep 12, 2024:
Why not directly check output in string_list?
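A sketch of the suggested simplification; the surrounding logic is not shown in the excerpt, so the exact handling of a match is assumed:

```python
assert string_list, "The input list must not be empty"
# A direct membership test replaces the explicit loop over candidates.
if output in string_list:
    text_tags.append(output)
```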

max_model_len=1024,
max_num_seqs=16,
sampling_params={'temperature': 0.1, 'top_p': 0.95, 'max_tokens': 256})

@drcege (Collaborator) commented Sep 12, 2024:
Can this OP run in multiple processes, especially without vLLM? Please add more tests.
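A sketch of what such a test might look like, assuming a unittest-style op test as elsewhere in data-juicer; the op name TextTaggingByPromptMapper, its arguments, and self.samples are assumptions, not the PR's actual test code:

```python
def test_multiprocess_without_vllm(self):
    # Hypothetical: confirm the OP works with num_proc > 1 when
    # vLLM is disabled and a model is loaded in each process.
    op = TextTaggingByPromptMapper(enable_vllm=False)
    dataset = Dataset.from_list(self.samples)
    dataset = dataset.map(op.process, num_proc=2)
    for sample in dataset:
        self.assertIn(Fields.text_tags, sample)
```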

max_num_seqs=16,
sampling_params={'temperature': 0.1, 'top_p': 0.95, 'max_tokens': 256})


@drcege (Collaborator) commented Sep 12, 2024:
It would be better to add tests for tensor_parallel_size.
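Similarly, a hedged sketch of such a test; the 2-GPU setting, the constructor arguments, and the _run_op helper are all assumptions:

```python
def test_vllm_tensor_parallel(self):
    # Hypothetical: exercise vLLM tensor parallelism across 2 GPUs.
    op = TextTaggingByPromptMapper(
        enable_vllm=True,
        tensor_parallel_size=2,
        sampling_params={'temperature': 0.1, 'top_p': 0.95,
                         'max_tokens': 256})
    self._run_op(op)  # hypothetical shared test helper
```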

github-actions bot commented Oct 3, 2024

This PR is marked as stale because there has been no activity for 21 days. Remove the stale label or add new comments, or this PR will be closed in 3 days.

github-actions bot commented Oct 7, 2024

Close this stale PR.

Labels
dj:op issues/PRs about some specific OPs
Projects
None yet

3 participants