Is your feature request related to a problem? Please describe.
Currently, the autolabel function performs a strict comparison between the output generated by the language model and the predefined list of valid labels. This raises an OUTPUT_GUIDELINES_NOT_FOLLOWED_ERROR whenever the generated output closely aligns with a label but does not match it exactly. For instance, when using the Llama-7B model with the banking dataset, outputs such as "Sure! Here is the label for your input: Input: I want to close my account. Output: terminate_account" are generated, which, despite the LLM having guessed the correct label, do not strictly match any label and therefore trigger the error.
Describe the solution you'd like
To address this issue, I propose adding an option to choose a more flexible label comparison mechanism when running a labelling task. Two possible mechanisms are described below (a rough code sketch of both follows the list):
Inclusion Check: Evaluate whether any of the predefined labels are contained within the language model's output. If exactly one label is found within the output, it should be designated as the generated label. However, this approach may be less effective when labels are commonly used words, due to the risk of false positives.
Similarity Assessment: Utilizing a similarity metric, such as the ROUGE Score, could offer a more nuanced evaluation of the relationship between the model's output and the potential labels. The label with the highest similarity score would be deemed the most appropriate. This approach should only categorize an output as successfully labeled if the top similarity score significantly surpasses a set threshold or is distinctly higher than other scores.
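As a rough illustration (not tied to autolabel's internals), here is a minimal Python sketch of both mechanisms. The function names (`inclusion_check`, `similarity_match`), the threshold and margin values, and the use of `difflib.SequenceMatcher` as a stand-in for ROUGE are all assumptions for illustration; a real implementation could plug in the `rouge_score` package or any other similarity metric instead.

```python
import difflib
from typing import List, Optional


def inclusion_check(output: str, labels: List[str]) -> Optional[str]:
    """Return the label if exactly one predefined label appears in the output."""
    matches = [label for label in labels if label.lower() in output.lower()]
    return matches[0] if len(matches) == 1 else None


def similarity_match(
    output: str,
    labels: List[str],
    threshold: float = 0.8,   # assumed value; would need tuning per dataset
    margin: float = 0.1,      # assumed value; gap required over the runner-up
) -> Optional[str]:
    """Return the most similar label, but only if its score clearly wins."""
    # Score every label against the raw model output (SequenceMatcher here,
    # but a ROUGE score or another metric could be substituted).
    scored = sorted(
        (
            (difflib.SequenceMatcher(None, output.lower(), label.lower()).ratio(), label)
            for label in labels
        ),
        reverse=True,
    )
    best_score, best_label = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    # Accept only if the top score passes the threshold or is distinctly
    # higher than the second-best score; otherwise treat it as unlabeled.
    if best_score >= threshold or (best_score - runner_up) >= margin:
        return best_label
    return None


# Illustrative usage with a few banking intent labels:
labels = ["terminate_account", "edit_personal_details", "card_arrival"]
print(inclusion_check("Sure! Here is the label: terminate_account", labels))  # terminate_account
print(similarity_match("terminate_acount", labels))  # tolerates the misspelling -> terminate_account
```

The second function illustrates the guard described above: a near-miss such as a misspelled label still resolves, while an output that is not clearly closest to any single label returns None and can still be reported as an error.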
The second approach significantly enhanced classification accuracy in a project where I fine-tuned Llama-2-70B for categorizing physicians' diagnostic notes into specific types of cancer. Given the complexity and length of oncological terminology, the model was prone to minor spelling inaccuracies in its classifications. Implementing this method resulted in a marked improvement in the precision of the model's categorizations.