I have now observed many instances where a file upload ran into the (already extended) 50 s script execution timeout. Passing a multi-GB file through the application and then uploading it to the S3 storage just took too long.
One alternative could be presigned upload URLs, but they do not allow file size or MIME type validation.
The only other way is chunked uploads, which require additional processing and application logic. Thoughts:
Files are sent in chunks if their size exceeds a certain threshold (e.g. 100 MB). Use a chunk size of 100 MB.
Chunked files get the additional request parameters `chunk_index` and `chunk_total`, specifying the index of the currently uploaded chunk and the total number of chunks of the file.
File chunks are stored in the pending storage disk at their "normal" path but with their chunk index as suffix, e.g. `my/file.jpg.0`, `my/file.jpg.1`.
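A minimal sketch of how a chunk could be received and stored, assuming a Laravel controller and a storage disk named `pending`; the controller name, the `path` input and the route are illustrative, not the final API:

```php
use Illuminate\Http\Request;

class StorageRequestFileController extends Controller
{
    // Hypothetical controller action for a single chunk upload.
    public function storeChunk(Request $request)
    {
        $validated = $request->validate([
            'file' => 'required|file',
            'path' => 'required|string',
            'chunk_index' => 'required|integer|min:0',
            'chunk_total' => 'required|integer|min:1',
        ]);

        // Store the chunk on the pending disk at the "normal" path with the
        // chunk index as suffix, e.g. "my/file.jpg.0", "my/file.jpg.1".
        $request->file('file')->storeAs(
            dirname($validated['path']),
            basename($validated['path']).'.'.$validated['chunk_index'],
            'pending'
        );

        return response()->noContent();
    }
}
```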
To support validation of chunked files (i.e. `max_file_size` and whether all chunks of a file have been received), we need to introduce a new `StorageRequestFile` model with the attributes `path`, `size`, `received_chunks` and `total_chunks`.
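A sketch of what the corresponding migration could look like; the `storage_request_id` foreign key and the JSON representation of `received_chunks` are assumptions:

```php
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::create('storage_request_files', function (Blueprint $table) {
            $table->id();
            // Hypothetical link to the parent storage request.
            $table->foreignId('storage_request_id')->constrained()->cascadeOnDelete();
            $table->string('path');
            // Accumulated size in bytes of all received chunks.
            $table->unsignedBigInteger('size')->default(0);
            // Indices of the chunks received so far, e.g. [0, 1, 3] (assumed JSON array).
            $table->json('received_chunks')->nullable();
            $table->unsignedInteger('total_chunks')->nullable();
            $table->timestamps();
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('storage_request_files');
    }
};
```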
Size validation: The `size` attribute of a file is increased for each uploaded chunk. If the `size` exceeds the `max_file_size` threshold (or the quota), the chunk is rejected with a failed validation and a queued job is dispatched to delete all the existing chunks of the file.
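Roughly how the per-chunk size check could work; the `storage.max_file_size` config key and the `DeleteFileChunks` job are hypothetical names:

```php
use Illuminate\Validation\ValidationException;

// A sketch of the per-chunk size check inside the upload handling.
$chunkSize = $request->file('file')->getSize();

if ($file->size + $chunkSize > config('storage.max_file_size')) {
    // Clean up the already uploaded chunks asynchronously (hypothetical job).
    DeleteFileChunks::dispatch($file);

    throw ValidationException::withMessages([
        'file' => 'The file exceeds the maximum allowed size.',
    ]);
}

$file->update(['size' => $file->size + $chunkSize]);
```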
Chunk validation: The `received_chunks` and `total_chunks` attributes can tell if all chunks of a file have been received. Reject the submission of a storage request if a file has missing chunks. Also reject the upload of already received chunks (e.g. re-uploading the first chunk with a different MIME type).
MIME type validation: The first chunk of a file must be uploaded first. It is checked for a valid MIME type. No other chunk is accepted until this check has passed.
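The chunk and MIME type rules from the last two points could be enforced roughly like this (a sketch, assuming `received_chunks` is cast to an array on the model and `hasValidMimeType()` is a hypothetical helper):

```php
use Illuminate\Validation\ValidationException;

// A sketch of the chunk bookkeeping rules during an upload.
$index = (int) $request->input('chunk_index');
$received = $file->received_chunks ?? [];

// Reject chunks that have already been received.
if (in_array($index, $received)) {
    throw ValidationException::withMessages([
        'chunk_index' => 'This chunk has already been uploaded.',
    ]);
}

if ($index === 0) {
    // Hypothetical helper that checks the chunk against the allowed MIME types.
    if (!$this->hasValidMimeType($request->file('file'))) {
        throw ValidationException::withMessages(['file' => 'Invalid MIME type.']);
    }
} elseif (!in_array(0, $received)) {
    // No other chunk is accepted before the first chunk passed the MIME check.
    throw ValidationException::withMessages([
        'chunk_index' => 'The first chunk must be uploaded first.',
    ]);
}

$file->update(['received_chunks' => [...$received, $index]]);
```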
When a storage request with chunked files is submitted, an `AssembleChunkedFile` job is first dispatched for each chunked file (see below). Once all files have been assembled, the review notification is sent to the admins.
The `AssembleChunkedFile` job merges the chunks of a file, uploads the complete file and deletes the chunks. It also clears `received_chunks` and `total_chunks` of the file.
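A sketch of the job, assuming the assembled file also lives on the `pending` disk and the chunks are concatenated via streams to keep memory usage bounded:

```php
use App\Models\StorageRequestFile; // the model proposed above
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Http\File;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Storage;

class AssembleChunkedFile implements ShouldQueue
{
    use Dispatchable, Queueable, SerializesModels;

    public function __construct(public StorageRequestFile $file)
    {
    }

    public function handle(): void
    {
        $disk = Storage::disk('pending'); // disk name is an assumption
        $tmp = tempnam(sys_get_temp_dir(), 'assemble');
        $target = fopen($tmp, 'wb');

        // Concatenate the chunks in order: my/file.jpg.0, my/file.jpg.1, ...
        for ($i = 0; $i < $this->file->total_chunks; $i++) {
            $chunk = $disk->readStream("{$this->file->path}.{$i}");
            stream_copy_to_stream($chunk, $target);
            fclose($chunk);
        }
        fclose($target);

        // Upload the complete file at its "normal" path, then delete the chunks.
        $disk->putFileAs(dirname($this->file->path), new File($tmp), basename($this->file->path));
        for ($i = 0; $i < $this->file->total_chunks; $i++) {
            $disk->delete("{$this->file->path}.{$i}");
        }
        unlink($tmp);

        // Clear the chunk bookkeeping on the model.
        $this->file->update(['received_chunks' => null, 'total_chunks' => null]);
    }
}
```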
Bonus: We know the size of each file, which is required to implement #8. Also:
The total size of a storage request can be shown in the admin notification, on the review view and in the list of storage requests.
The used quota display can be updated immediately in the list of storage requests.
The used quota of a user can now be determined by adding up the sizes of all their uploaded files in the DB.
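Determining the used quota could then be a single aggregate query; the `storageRequest` relation and the `user_id` column are assumptions:

```php
// Sum the sizes of all files belonging to a user's storage requests.
$usedQuota = StorageRequestFile::whereHas('storageRequest', function ($query) use ($user) {
    $query->where('user_id', $user->id);
})->sum('size');
```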
After implementation:
The increased script execution timeout can be reduced to the defaults again.
Update the multipart upload configuration of the S3 storage disk.