Code Stuck Due to Retry in Greedy Mode #5

Open
huiyeruzhou opened this issue Jun 21, 2024 · 0 comments

Hi, thanks for this wonderful dataset! I ran into some issues while running the tests.

Problem and Analysis

While running the testing code, the program appeared to get stuck. Debugging showed that it kept retrying the greedy ChatGPT calls. This happens when `extract_answer` attempts to extract an answer and the model returns a blank reply because no answer is found:

    def get_chat_response(self, prompt, temperature=0, max_tokens=256, n=1, patience=1000, sleep_time=0):
        messages = [
            {"role": "user", "content": prompt},
        ]
        payload = {"model": self.gpt_model, "messages": messages, "temperature": temperature, "max_tokens": max_tokens, "n":n}

        while patience > 0:
            patience -= 1
            try:
                response = self._post_request(payload)
                if n == 1:
                    prediction = response["choices"][0]["message"]["content"].strip()
                    if prediction and prediction != "":
                        return prediction

And in `score_answer` there is a potentially infinite loop: it keeps retrying as long as the output is not 0/1.

            judgement = match_answer(save_inst, args.api_key, args.quick_match)
            while True:
                if judgement.strip() not in ['0', '1']:
                    print('Wrong return format: ', judgement)
                    judgement = match_answer(save_inst, args.api_key, args.quick_match)
                else:
                    save_inst['judgement'] = int(judgement)
                    break

In both cases the retry budget is enormous (or outright infinite), and because decoding is greedy the model will never generate a different response on a retry. As a result, the code stalls and burns a lot of API quota.
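To make the failure mode concrete, here is a minimal, self-contained sketch (the function names `call_model` and `get_answer` are hypothetical stand-ins, not the repository's code): with `temperature=0` the model's reply is deterministic, so a retry loop that waits for a different reply can never make progress, and the only safe option is to give up immediately.

```python
def call_model(prompt, temperature=0):
    # Stand-in for the API call. A greedy (temperature=0) model returns
    # the same string for the same prompt on every call.
    return ""  # the blank reply that triggers the retry loop


def get_answer(prompt, temperature=0, patience=1000):
    while patience > 0:
        patience -= 1
        reply = call_model(prompt, temperature).strip()
        if reply:
            return reply
        if temperature == 0:
            # Deterministic decoding: retrying cannot change the result,
            # so return immediately instead of looping `patience` times.
            return ""
    return ""
```

Without the `temperature == 0` short-circuit, this loop would issue 1000 identical API calls before giving up.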

I also noticed that the prompt doesn't explicitly ask the model to output null when it can't extract an answer, or to output only 0/1 without explanation when scoring; the few-shot examples alone seem insufficient to restrict the format.

What I tried to fix

I added format requirements to the prompts:

For extract:

Directly output the extracted answer with no explanation. 

For score:

Output the Judgement (0 or 1) DIRECTLY without any explanation.

In the code, for `get_chat_response`:

    def get_chat_response(self, prompt, temperature=0, max_tokens=256, n=1, patience=1000, sleep_time=0):
        messages = [
            {"role": "user", "content": prompt},
        ]
        payload = {"model": self.gpt_model, "messages": messages, "temperature": temperature, "max_tokens": max_tokens, "n": n}

        while patience > 0:
            patience -= 1
            try:
                response = self._post_request(payload)
                if n == 1:
                    prediction = response["choices"][0]["message"]["content"].strip()
                    if prediction and prediction != "":
                        return prediction
+                   else:
+                       if temperature == 0:
+                           # no need to retry: greedy search always returns the same result
+                           return ""

And for `score_answer` (using `judgement.strip()[:1]` rather than `judgement[0]` so a blank reply doesn't raise an `IndexError`, and indexing the stripped string so a trailing explanation doesn't break `int()`):

        judgement = match_answer(save_inst, args.api_key, args.quick_match)
-       while True:
-           if judgement.strip() not in ['0', '1']:
-               print('Wrong return format: ', judgement)
-               judgement = match_answer(save_inst, args.api_key, args.quick_match)
-           else:
-               save_inst['judgement'] = int(judgement)
-               break
+       if judgement.strip()[:1] not in ['0', '1']:
+           print('Wrong return format: ', judgement)
+       else:
+           save_inst['judgement'] = int(judgement.strip()[0])

With these changes, both issues are alleviated.
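For illustration, the defensive judgement parsing could also be factored into a small helper (the name `parse_judgement` is hypothetical, not part of the repository): it accepts `"0"`/`"1"`, tolerates a trailing explanation, and returns `None` for blank or malformed replies instead of crashing or retrying.

```python
def parse_judgement(reply):
    # Take the first non-whitespace character; this is "" for a blank
    # reply, so no IndexError is possible.
    first = reply.strip()[:1]
    if first in ("0", "1"):
        # Tolerates replies like "1\nBecause the answers match."
        return int(first)
    # Blank or unexpected format: let the caller decide what to do.
    return None
```

Returning `None` pushes the "wrong format" decision up to the caller, which can then log it once rather than retrying a deterministic call.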

Discussion

I am curious whether others have encountered this issue, as it should be universal given the nature of greedy decoding. I also look forward to the developers' feedback on my modifications; I am happy to submit a PR if my understanding and fix are correct.

Thanks!
