
rsc: update read_job transaction hit path #1671

Merged
13 commits merged into master on Jan 21, 2025

Conversation

AbrarQuazi (Contributor)

No description provided.

})
.await;
// Step 6: TODO before recording a hit and returning, need verification that objects from transaction did not change in
// database, or convert a hit into a miss
Contributor Author

I have questions on how best to accomplish this verification (converting a hit into a miss). Do I just query (output_files, output_symlinks, output_dirs) again and verify they match what is in the database? That seems wasteful, and what happens if another update lands while I am doing the equivalence check?

Can we instead rely on job::Entity's created_at field (or add a new one called updated_at) and just quickly check that the timestamps match before returning?
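For illustration, the proposed timestamp check could be sketched as below. This is a minimal std-only sketch: the `updated_at` field is the commenter's proposal, not the actual schema, and plain `u64` values stand in for real timestamps.

```rust
// Sketch of the proposed verification: the timestamp seen inside the
// transaction must match a re-read taken just before returning,
// otherwise the cache hit is downgraded to a miss. The `updated_at`
// field name is hypothetical (proposed in the comment above).
#[derive(Debug, PartialEq)]
enum Lookup {
    Hit,
    Miss,
}

fn verify_hit(updated_at_in_txn: u64, updated_at_now: u64) -> Lookup {
    if updated_at_in_txn == updated_at_now {
        Lookup::Hit
    } else {
        Lookup::Miss
    }
}

fn main() {
    // Row unchanged between read and return: the hit stands.
    assert_eq!(verify_hit(100, 100), Lookup::Hit);
    // Row updated in between: convert the hit into a miss.
    assert_eq!(verify_hit(100, 101), Lookup::Miss);
}
```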

.filter(entity::blob::Column::Id.is_in(ids.to_vec()))
.all(db)
.await
.map_err(|e| format!("Failed to query blobs: {}", e))?;
Contributor

Hmm, it seems like we could have done `.await?.into_iter().map(...)`, folding in the blob_map code from the next line.

Contributor Author

Yep, I'll change that.

// Ensure we have all requested blobs
for &id in ids {
if !blob_map.contains_key(&id) {
return Err(format!("Unable to find blob {} by id", id));
Contributor

How does this error condition differ from the failed to query blobs check above?

Contributor Author

So the error above checks for database-related failures, like connection issues.

The check here verifies that every requested row actually came back from the query (a requested ID might not exist in the database, in which case nothing is returned for it).

Perhaps I can change the error text for the condition above to something like `Failed to query blobs, database error: {}`.
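The distinction can be sketched std-only (types and messages are placeholders): the earlier `map_err` covers the query itself failing, while this check catches ids for which the query succeeded but returned no row.

```rust
use std::collections::HashMap;

// The query-level map_err covers transport/database failures. This
// check instead catches requested ids that simply have no row: the
// query succeeds but silently omits them from the result set.
fn ensure_all_present(ids: &[u32], blob_map: &HashMap<u32, String>) -> Result<(), String> {
    for &id in ids {
        if !blob_map.contains_key(&id) {
            return Err(format!("Unable to find blob {} by id", id));
        }
    }
    Ok(())
}

fn main() {
    let mut blob_map = HashMap::new();
    blob_map.insert(1, "s3://bucket/blob-1".to_string());
    assert!(ensure_all_present(&[1], &blob_map).is_ok());
    // Id 2 was requested but no row exists for it, so no database error
    // fires; only this membership check reports it.
    assert!(ensure_all_present(&[1, 2], &blob_map).is_err());
}
```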

Comment on lines 76 to 84
let mut resolved_map = HashMap::new();
for res in results {
let (id, resolved_blob) = res?;
resolved_map.insert(id, resolved_blob);
}
Contributor

Is there no simpler constructor for this?
We only do this iteratively because we wanted to resolve the URLs in parallel, rather than with a map on the original blob_map?

Contributor Author

So I think I can do this instead of the manual for loop:
`let resolved_map: HashMap<Uuid, ResolvedBlob> = results.into_iter().collect::<Result<_, _>>()?;`

Yep, that's exactly why we do it. We could technically add to the hashmap in parallel too, but I don't think there's any benefit, and it adds complexity.
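The one-liner works because `Result` implements `FromIterator`: collecting an iterator of `Result<(K, V), E>` into `Result<HashMap<K, V>, E>` short-circuits on the first error. A std-only sketch with `u32`/`String` standing in for `Uuid`/`ResolvedBlob`:

```rust
use std::collections::HashMap;

// Collecting an iterator of Result<(K, V), E> into Result<HashMap, E>
// stops at the first Err; otherwise it builds the map in one pass.
fn collect_resolved(
    results: Vec<Result<(u32, String), String>>,
) -> Result<HashMap<u32, String>, String> {
    results.into_iter().collect()
}

fn main() {
    let ok = vec![Ok((1, "blob-a".to_string())), Ok((2, "blob-b".to_string()))];
    assert_eq!(collect_resolved(ok).unwrap().len(), 2);

    let bad = vec![Ok((1, "blob-a".to_string())), Err("resolve failed".to_string())];
    assert!(collect_resolved(bad).is_err());
}
```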

})
.await;

let hash_copy = hash_for_spawns.clone();
Contributor

Why isn't this just hash.clone()?

Contributor Author

So the reason I can't use hash.clone() is that Rust complains that hash is used by the transaction above: it is moved into the transaction's closure because String does not implement Copy. Because of this I need to create two clones of hash, one for the transaction and one for the tokio spawns (there are two of them).
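The move semantics can be reproduced with plain std threads (a minimal sketch; the real code moves `hash` into a transaction closure and tokio spawns):

```rust
use std::thread;

// `String` is not `Copy`, so the first `move` closure consumes `hash`.
// Every additional closure therefore needs its own clone, taken before
// the move happens; cloning `hash` afterwards would not compile.
fn len_via_two_spawns(hash: String) -> (usize, usize) {
    let hash_for_spawns = hash.clone(); // clone before `hash` is moved
    let first = thread::spawn(move || hash.len()); // `hash` moved here
    let second = thread::spawn(move || hash_for_spawns.len());
    (first.join().unwrap(), second.join().unwrap())
}

fn main() {
    assert_eq!(len_via_two_spawns("abc123".to_string()), (6, 6));
}
```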

@AbrarQuazi force-pushed the update-transaction-read-job branch from c33b2a7 to f126081 on January 16, 2025 at 23:17
@colinschmidt (Contributor) left a comment

Logic seems correct; a few error-message and style nits left, but LGTM.

.map(|m| {
let blob_id = m.blob_id;
let resolved_blob = resolved_blob_map.get(&blob_id).cloned().ok_or_else(|| {
format!("Missing resolved blob for {}", blob_id)
Contributor

Do we want to track the job that had a blob fetch failure? I can see that being useful debug info when something is failing.

Contributor Author

Sounds good, I'll add job_id to the failure message.
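The adjusted message might look like this (signature hypothetical; ids shown as `u32` for brevity):

```rust
// Include the owning job in the failure text so a blob fetch failure
// can be traced back to the job that referenced the blob.
fn missing_blob_msg(blob_id: u32, job_id: u32) -> String {
    format!("Missing resolved blob {} for job {}", blob_id, job_id)
}

fn main() {
    assert_eq!(
        missing_blob_msg(7, 42),
        "Missing resolved blob 7 for job 42"
    );
}
```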

};

let job_id = matching_job.id;
let hash_copy = hash_for_spawns.clone();
Contributor

Again it feels like we didn't need to create this separate hash_for_spawns variable and could have instead continued to clone hash

Contributor Author

[Screenshot: compiler error, 2025-01-21] This is the Rust compiler error I am getting for just trying to clone `hash`.

@colinschmidt (Contributor) left a comment

LGTM

@AbrarQuazi AbrarQuazi merged commit 6f23bbd into master Jan 21, 2025
11 checks passed
@AbrarQuazi AbrarQuazi deleted the update-transaction-read-job branch January 21, 2025 23:09