fix: pptx uploaded on google docs #540
I think we talked about exporting the file as presentation in the first place here gooey-server/daras_ai_v2/gdrive_downloader.py Lines 159 to 161 in 66a05d7
OK, never mind, I don't think that works for pptx files in Slides. But we can be smart about it by looking at the mime type returned by ...
In addition, we can also add ...
510dd9e to 855adce
daras_ai_v2/gdrive_downloader.py
Outdated
def service_request(
    service, file_id: str, f: furl, mime_type: str, retried_request=False
remove unused param
daras_ai_v2/gdrive_downloader.py
Outdated
request, mime_type = service_request(service, file_id, f, mime_type)
file_bytes, mime_type = download_blob_file_content(
    service, request, file_id, f, mime_type, export_links
I don't think we need the download_blob_file_content to be a separate function here
daras_ai_v2/gdrive_downloader.py
Outdated
def download_from_exportlinks(f: furl) -> bytes:
    try:
        r = requests.get(f)
        f_bytes = r.content
use raise_for_status() like we do everywhere else in code
This likely doesn't support downloading private docs without auth, but I may be wrong.
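The two comments above could be addressed roughly as follows. This is only a sketch, not the code in this PR: the `session` and `timeout` parameters are assumptions added for illustration, and the function takes a plain URL string rather than a `furl`. It calls `raise_for_status()` so that an HTTP error (for example a 401 or 403 on a private doc fetched without credentials) raises instead of being returned as file content.

```python
def download_from_exportlinks(url: str, session=None, timeout: float = 30.0) -> bytes:
    """Fetch an export link and return its bytes, raising on HTTP errors.

    `session` is a hypothetical parameter: any object with a requests-style
    `get(url, timeout=...)` method, e.g. a session that carries auth headers.
    """
    if session is None:
        # Imported lazily so the module loads even where requests is absent.
        import requests

        session = requests.Session()
    r = session.get(url, timeout=timeout)
    # Raise on 4xx/5xx rather than treating an error page as the file body.
    r.raise_for_status()
    return r.content
```

For private files, the session would presumably need to carry the same credentials as the Drive service, for instance `google.auth.transport.requests.AuthorizedSession`. Whether export links accept those credentials in this codebase is not confirmed here.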
daras_ai_v2/gdrive_downloader.py
Outdated
if (
    mime_type
    == "application/vnd.openxmlformats-officedocument.presentationml.presentation"
):
    # logger.debug(f"Downloading {str(f)!r} using export links")
    f_url_export = export_links.get(mime_type, None)
    if f_url_export:
        f_bytes = download_from_exportlinks(f_url_export)
    else:
        request = service.files().get_media(
            fileId=file_id,
            supportsAllDrives=True,
        )
        downloader = MediaIoBaseDownload(file, request)

        done = False
        while done is False:
            _, done = downloader.next_chunk()
            # print(f"Download {int(status.progress() * 100)}%")
        f_bytes = file.getvalue()

else:
    done = False
    while done is False:
        _, done = downloader.next_chunk()
        # print(f"Download {int(status.progress() * 100)}%")
    f_bytes = file.getvalue()
This entire logic looks a bit repetitive. You wanna look at the diff really closely and figure out if the changes make sense.
I think you only need to change how the export google docs part was working:
But I can't really make sure by looking at the multiple diverging code paths here. The export_media() option is not needed if you use exportLinks I believe
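One way to collapse the diverging paths is to decide up front, from the file metadata alone, whether an export link applies. The sketch below assumes the metadata dict has the shape of a Drive v3 file resource with an optional `exportLinks` mapping of MIME type to URL. `choose_download` and `PPTX_MIME` are hypothetical names, not part of this PR.

```python
from typing import Optional, Tuple

PPTX_MIME = (
    "application/vnd.openxmlformats-officedocument.presentationml.presentation"
)


def choose_download(file_meta: dict, export_mime_type: str) -> Tuple[str, Optional[str]]:
    """Return ("export", url) if an export link exists for the requested type,
    otherwise ("media", None) to signal a plain get_media() download."""
    # Binary uploads may have no exportLinks at all, so default to an empty map.
    export_links = file_meta.get("exportLinks") or {}
    url = export_links.get(export_mime_type)
    if url:
        return ("export", url)
    return ("media", None)
```

With this split, the chunked `MediaIoBaseDownload` loop would only need to appear once, on the `"media"` branch.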
Ah, yes, this makes way more sense. I wanted to only use export_links for pptx without breaking the existing logic, hence the mess.
Q/A checklist
You can visualize this using tuna:
To measure import time for a specific library:
To reduce import times, import libraries that take a long time inside the functions that use them instead of at the top of the file:
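As a rough illustration of the deferred-import advice (not the project's own snippet), the sketch below moves an import inside the function that needs it and adds a small timing helper. `json` stands in for a genuinely slow library, and both function names are hypothetical.

```python
import time


def parse_payload(text: str) -> dict:
    # Deferred import: the cost is paid on the first call,
    # not when this module itself is imported.
    import json

    return json.loads(text)


def timed_import(module_name: str) -> float:
    """Return the seconds spent importing `module_name` in this process.

    A module that is already cached in sys.modules returns almost instantly,
    so this is only meaningful for the first import.
    """
    import importlib

    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start
```

CPython's `-X importtime` flag gives a finer per-module breakdown than a helper like this.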
Legal Boilerplate
Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.