Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StorageClient.UploadObject does not consistently throw an exception or guarantee file upload success #14035

Closed
yuceadnan opened this issue Dec 27, 2024 · 6 comments
Assignees
Labels
api: storage Issues related to the Cloud Storage API. needs more info This issue needs more information from the customer to proceed. priority: p3 Desirable enhancement or fix. May not be included in next release. type: question Request for information or clarification. Not an issue.

Comments

@yuceadnan
Copy link

The StorageClient.UploadObject method in the Google Cloud Storage .NET library does not consistently throw an exception when the upload process somehow fails. Occasionally, the method does not throw an exception, but the file is not present in the bucket.

Expected Behavior: The UploadObject method should either:

  • Throw an exception when the upload process fails.
  • Provide a reliable way to confirm whether the upload was successful.

Steps to Reproduce:

Use the following code snippet:

public CommentResultModel UploadComment(string comment)
{
    var contentType = "text/plain";
    var bucketName = "bucket";
    var uniqueFileId = Guid.NewGuid();
    var commentFilePath =CommentsPath(uniqueFileId.ToString();
    using (var ms = new MemoryStream())
    using (var tw = new StreamWriter(ms))
    {
        tw.Write(comment);
        tw.Flush();
        ms.Position = 0;
        _cloudStorageService.UploadFile(bucketName, commentFilePath, contentType, ms);
    }
            
    //Save comment history
    _repository.AddCommentHistory(uniqueFileId);
    
    return new CommentResultModel
    {
        UniqueFileId = uniqueFileId,
        BucketName = bucketName,
        FilePath = commentFilePath
    };
}

I have a .NET Standard 2.0 library that abstracts the functionality of StorageClient. Below are the details of the _cloudStorageService.UploadFile method.

public void UploadFile(string bucketName, string fileName, string contentType, Stream content)
{
    _storageClient.UploadObject(bucketName, fileName, contentType, content); 
}

Is there a scenario where the UploadObject method does not throw an exception even if the file upload fails? Can we ensure that the file was uploaded successfully by checking specific properties in the return model of UploadObject?
For example:

public void UploadFile(string bucketName, string fileName, string contentType, Stream content)
{
    Google.Apis.Storage.v1.Data.Object object = _storageClient.UploadObject(bucketName, fileName, contentType, content);
    if (object == null || object.MediaLink == null)
    {
        throw new Exception("File upload failed. The file '" + fileName + "' was not written to the bucket '" + bucketName + "'.");
    }
}

Does the approach above guarantee the detection of failed uploads? Or are there cases where additional validation is needed to ensure the file exists in the bucket?

Environment details

  • Programming language: C#
  • OS: Windows
  • Language runtime version: net framework 4.6.1
  • Package: Google.Cloud.Storage.V1
  • Package version: 3.7.0

Thanks!

@jskeet
Copy link
Collaborator

jskeet commented Dec 27, 2024

This is a drive-by comment as all the maintainers are on vacation until January - but I don't think I've ever observed this. How often are you seeing it, and are there any patterns you've seen?

This is definitely not the expected behavior - I've seen uploads fail with an exception before, but I've never seen a case where the call has completed without an exception but without the object being persisted in GCS. When we're back in January I'm sure we'll be happy to help diagnose the problem, but we'll need more information.

(As an aside, a simpler way to obtain the stream containing one string would just be new MemoryStream(Encoding.UTF8.GetBytes(comment)) - and there's no need to dispose of that stream afterwards. Just a way to simplify your code a bit.)

@jskeet jskeet added the needs more info This issue needs more information from the customer to proceed. label Jan 2, 2025
@yuceadnan
Copy link
Author

yuceadnan commented Jan 3, 2025

@jskeet Thanks for the suggestion. Here are more details about the case:
I haven’t noticed any specific patterns, but it happens very rarely. Let me explain how I implemented the StorageClient in detail:
As mentioned, I have a .NET Standard 2.0 library that abstracts the functionality of StorageClient. I also use KmsClient to create encrypted buckets.
Service Details:
CloudStorageService is configured to operate in a Singleton scope

public class CloudStorageService : ICloudStorageService
 {
     private readonly StorageClient _storageClient;
     public CloudStorageService(CloudStorageSettings settings)
     {
         GoogleCredential credential = GoogleCredential.FromFile(settings.ServiceAccountKeyPath);
         _storageClient = StorageClient.Create(credential);
     }  
    //Methods
    public void UploadFile(string bucketName, string fileName, string contentType, Stream content)
    {
        var uploadedObject =  _storageClient.UploadObject(bucketName, fileName, contentType, content);
        if (uploadedObject == null || uploadedObject.MediaLink == null)
            throw new Exception($"File upload failed. The file '{fileName}' was not written to the bucket '{bucketName}'.");
    }
}

@jskeet
Copy link
Collaborator

jskeet commented Jan 3, 2025

Rather than show more details of your implementation, it would be really helpful to see a minimal but complete repro. I doubt that KMS is needed to do that - although if it is, just in terms of creating an encrypted bucket, presumably that could be done manually as part of preparing for the repro, to reduce the amount of code required.

Note that UploadObjectAsync will never return null - looking at the implementation a few levels down at https://github.com/googleapis/google-cloud-dotnet/blob/main/apis/Google.Cloud.Storage.V1/Google.Cloud.Storage.V1/StorageClientImpl.UploadObject.cs#L203, the code would throw a NullReferenceException before returning the null reference.

Can you be more specific about the frequency than "happens rarely"? What proportion - one in a hundred uploads, one in thousand, one in a million? How are you detecting this afterwards? (It still feels more likely that this is a diagnostic problem than one in the client library, but without more information it will be hard to pin that down.)

@amanda-tarafa amanda-tarafa assigned jskeet and unassigned amanda-tarafa Jan 6, 2025
@amanda-tarafa amanda-tarafa added api: storage Issues related to the Cloud Storage API. type: question Request for information or clarification. Not an issue. priority: p3 Desirable enhancement or fix. May not be included in next release. labels Jan 6, 2025
@yuceadnan
Copy link
Author

@jskeet I’ve updated the example code block. I compared the saved comments history with the list of objects from the bucket, and there are only 2 missing items out of 15,000.

How can I be sure about this? As mentioned earlier, I save a history record immediately after calling _cloudStorageService.UploadFile. If the storage client had thrown an exception, there wouldn’t be a history record for that item.

By the way, I’m using the synchronous method instead of the async one. I believe there’s no significant difference between them in this context.

@jskeet
Copy link
Collaborator

jskeet commented Jan 7, 2025

That's still not a complete program though - not something I can run.

Here's an example of a genuinely complete program, which upload a set number of objects and then counts them (expecting the bucket to be empty to start with):

using Google.Cloud.Storage.V1;
using System.Text;

if (args.Length != 2)
{
    Console.WriteLine("Arguments: <bucket> <count>");
    return 1;
}

TimeSpan progressInterval = TimeSpan.FromSeconds(5);

string bucket = args[0];
int count = int.Parse(args[1]);

var client = StorageClient.Create();

var nextProgress = DateTime.MinValue;

Log("Uploading objects.");
for (int i = 0; i < count; i++)
{
    var now = DateTime.UtcNow;
    if (now > nextProgress)
    {
        Log($"Uploading object {i}");
        nextProgress = now + progressInterval;
    }
    string name = $"object-{i:D7}";
    var data = new MemoryStream(Encoding.UTF8.GetBytes(name));
    client.UploadObject(bucket, name, "text/plain", data);
}

Log("All objects uploaded. Listing objects.");

var actualCount = client.ListObjects(bucket).Count();
Log($"Uploaded {count} objects. Listing found {actualCount} objects");

return 0;

void Log(string message) =>
    Console.WriteLine($"{DateTime.UtcNow:HH:mm:ss}: {message}");

I've just run that, uploading 100,000 objects, and it didn't miss any. If you run the same code, do you observe any missing objects? Basically it's crucial that we can get to the stage where I can run the same code that reproduces the issue for you - I don't mind whether that's an adaptation of your code, or code that I've come up with, but I've got to be able to run the same code.

@yuceadnan
Copy link
Author

The issue is that I can't share a repeating code block for the problem. As you mentioned, the issue might be related to the implementation. When I compare it with the example you shared, the only difference is that I pass the credential as a parameter while creating the StorageClient. I don't think this makes a difference either. I'll close the topic for now and revisit it if I have concrete evidence in the future. Thank you for your support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API. needs more info This issue needs more information from the customer to proceed. priority: p3 Desirable enhancement or fix. May not be included in next release. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

No branches or pull requests

3 participants