
[Fix] batch_gpt4.py parsing issue with openai's HttpxBinaryResponseContent #161

Closed
wants to merge 1 commit

Conversation

abzb1
Contributor

@abzb1 abzb1 commented Jul 24, 2024

In the existing code, batch_results is processed with json.loads(). However, batch_results is an HttpxBinaryResponseContent object from openai and cannot be parsed directly; the raw payload has to be read via batch_results.content or batch_results.text.
Because the results are batched, that payload is in JSONL format, so calling json.loads() on the whole string raises json.decoder.JSONDecodeError: Extra data: line 2 column 1.
The content should therefore be split on \n and each line parsed individually.
This commit addresses the issue. If there is a better implementation, please let me know.
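A minimal sketch of the line-by-line parsing described above (the helper name and the stub object are mine, not from the PR; the real object is the HttpxBinaryResponseContent returned by the openai client, whose `.text` attribute holds the JSONL payload):

```python
import json

def parse_batch_results(batch_results):
    """Parse a batch-output payload in JSONL format.

    `batch_results` is assumed to expose the raw JSONL text via `.text`,
    as HttpxBinaryResponseContent does. Each non-empty line is one JSON
    record, so we parse line by line instead of the whole body at once.
    """
    parsed = []
    for line in batch_results.text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        parsed.append(json.loads(line))
    return parsed
```

Calling json.loads() on the full text would fail at the second record ("Extra data: line 2 column 1"), while the per-line loop above handles any number of records.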


@abzb1 abzb1 changed the title Update batch_gpt4.py Fix batch_gpt4.py for addressing parsing problem Jul 24, 2024
@abzb1 abzb1 changed the title Fix batch_gpt4.py for addressing parsing problem [Fix] batch_gpt4.py for addressing parsing problem Jul 24, 2024
@abzb1 abzb1 changed the title [Fix] batch_gpt4.py for addressing parsing problem [Fix] batch_gpt4.py parsing issue with openai's HttpxBinaryResponseContent Jul 24, 2024
@abzb1 abzb1 closed this Jul 24, 2024
@abzb1 abzb1 deleted the patch-2 branch July 24, 2024 01:30
@abzb1 abzb1 restored the patch-2 branch July 24, 2024 01:36
@abzb1 abzb1 reopened this Jul 24, 2024
@abzb1 abzb1 closed this Jul 24, 2024
@abzb1
Contributor Author

abzb1 commented Jul 24, 2024

We also need to fix the error below:

    file_id = self.upload_input_file(file_path)
  File "/mnt/ocr-nfsx2/public/ohs/lmms_eval/lmms-eval/lmms_eval/models/batch_gpt4.py", line 194, in upload_input_file
    response = self.client.files.create(file=file, purpose="batch")
  File "/mnt/ocr-nfsx2/public/ohs/lmms_eval/.venv/lib/python3.10/site-packages/openai/resources/files.py", line 118, in create
    return self._post(
  File "/mnt/ocr-nfsx2/public/ohs/lmms_eval/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1266, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/mnt/ocr-nfsx2/public/ohs/lmms_eval/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 942, in request
    return self._request(
  File "/mnt/ocr-nfsx2/public/ohs/lmms_eval/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1046, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.APIStatusError: <html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>cloudflare</center>
</body>
</html>
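One possible mitigation for the 413 Request Entity Too Large, sketched here as an assumption (the helper, its name, and the 100 MB default are mine; the actual size limit enforced on the upload path may differ): split the JSONL batch-input file into smaller files before calling `client.files.create(...)`, then upload and submit each part as its own batch.

```python
def split_jsonl(file_path, max_bytes=100 * 1024 * 1024):
    """Split a JSONL file into parts no larger than max_bytes each.

    Hypothetical helper: splits on line boundaries so every part is
    still valid JSONL. Returns the list of part file paths.
    """
    chunk_paths, buf, size, idx = [], [], 0, 0

    def flush():
        nonlocal buf, size, idx
        out = f"{file_path}.part{idx}"
        with open(out, "w", encoding="utf-8") as w:
            w.writelines(buf)
        chunk_paths.append(out)
        buf, size, idx = [], 0, idx + 1

    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:
            encoded = len(line.encode("utf-8"))
            if buf and size + encoded > max_bytes:
                flush()  # current part would overflow; start a new one
            buf.append(line)
            size += encoded
    if buf:
        flush()
    return chunk_paths
```

Each resulting part can then be passed to `upload_input_file` separately, keeping every individual upload under the proxy's size cap.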

@abzb1 abzb1 deleted the patch-2 branch July 24, 2024 03:49