
rsc: update read_job transaction hit path #1671

Merged
13 commits merged into master on Jan 21, 2025

Conversation

AbrarQuazi (Contributor)

No description provided.

})
.await;
// Step 6: TODO before recording a hit and returning, need verification that objects from transaction did not change in
// database, or convert a hit into a miss
Contributor Author

I have questions on how best to accomplish this verification (converting a hit into a miss). Do I just query (output_files, output_symlinks, output_dirs) again and verify they match what is in the database? That seems wasteful, and what happens if another update lands while I am doing the equivalence check?

Can we instead rely on job::Entity's created_at field (or add a new one called updated_at) and just quickly check that the timestamps match before returning?
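For illustration, the proposed timestamp check could be sketched as below. This is a minimal std-only sketch: the `updated_at` field is the commenter's proposal, not the actual schema, and plain `u64` values stand in for real timestamps.

```rust
// Sketch of the proposed verification: the timestamp seen inside the
// transaction must match a re-read taken just before returning,
// otherwise the cache hit is downgraded to a miss. The `updated_at`
// field name is hypothetical (proposed in the comment above).
#[derive(Debug, PartialEq)]
enum Lookup {
    Hit,
    Miss,
}

fn verify_hit(updated_at_in_txn: u64, updated_at_now: u64) -> Lookup {
    if updated_at_in_txn == updated_at_now {
        Lookup::Hit
    } else {
        Lookup::Miss
    }
}

fn main() {
    // Row unchanged between read and return: the hit stands.
    assert_eq!(verify_hit(100, 100), Lookup::Hit);
    // Row updated in between: convert the hit into a miss.
    assert_eq!(verify_hit(100, 101), Lookup::Miss);
}
```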

.filter(entity::blob::Column::Id.is_in(ids.to_vec()))
.all(db)
.await
.map_err(|e| format!("Failed to query blobs: {}", e))?;
Contributor

Hmm, it seems like we could have done `.await?.into_iter().map(...)`, folding in the blob_map code from the next line.

Contributor Author

Yep, I'll change that.

// Ensure we have all requested blobs
for &id in ids {
if !blob_map.contains_key(&id) {
return Err(format!("Unable to find blob {} by id", id));
Contributor

How does this error condition differ from the failed to query blobs check above?

Contributor Author

So the error above checks for database-related failures, like connection issues.

The check here verifies that every requested row actually came back from the query (a requested ID might not exist in the database, in which case nothing is returned for it).

Perhaps I can change the error text for the condition above to something like `Failed to query blobs, database error: {}`.
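The distinction can be sketched std-only (types and messages are placeholders): the earlier `map_err` covers the query itself failing, while this check catches ids for which the query succeeded but returned no row.

```rust
use std::collections::HashMap;

// The query-level map_err covers transport/database failures. This
// check instead catches requested ids that simply have no row: the
// query succeeds but silently omits them from the result set.
fn ensure_all_present(ids: &[u32], blob_map: &HashMap<u32, String>) -> Result<(), String> {
    for &id in ids {
        if !blob_map.contains_key(&id) {
            return Err(format!("Unable to find blob {} by id", id));
        }
    }
    Ok(())
}

fn main() {
    let mut blob_map = HashMap::new();
    blob_map.insert(1, "s3://bucket/blob-1".to_string());
    assert!(ensure_all_present(&[1], &blob_map).is_ok());
    // Id 2 was requested but no row exists for it, so no database error
    // fires; only this membership check reports it.
    assert!(ensure_all_present(&[1, 2], &blob_map).is_err());
}
```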

Comment on lines 76 to 84
let mut resolved_map = HashMap::new();
for res in results {
let (id, resolved_blob) = res?;
resolved_map.insert(id, resolved_blob);
}
Contributor

Is there no simpler constructor for this?
We only do this iteratively because we wanted to resolve the URLs in parallel, rather than with a map on the original blob_map?

Contributor Author

So I think I can do this instead of the manual for loop:
`let resolved_map: HashMap<Uuid, ResolvedBlob> = results.into_iter().collect::<Result<_, _>>()?;`

Yep, that's exactly why we do it. We could technically add to the hashmap in parallel too, but I don't think there's any benefit, and it adds complexity.
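The one-liner works because `Result` implements `FromIterator`: collecting an iterator of `Result<(K, V), E>` into `Result<HashMap<K, V>, E>` short-circuits on the first error. A std-only sketch with `u32`/`String` standing in for `Uuid`/`ResolvedBlob`:

```rust
use std::collections::HashMap;

// Collecting an iterator of Result<(K, V), E> into Result<HashMap, E>
// stops at the first Err; otherwise it builds the map in one pass.
fn collect_resolved(
    results: Vec<Result<(u32, String), String>>,
) -> Result<HashMap<u32, String>, String> {
    results.into_iter().collect()
}

fn main() {
    let ok = vec![Ok((1, "blob-a".to_string())), Ok((2, "blob-b".to_string()))];
    assert_eq!(collect_resolved(ok).unwrap().len(), 2);

    let bad = vec![Ok((1, "blob-a".to_string())), Err("resolve failed".to_string())];
    assert!(collect_resolved(bad).is_err());
}
```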

})
.await;

let hash_copy = hash_for_spawns.clone();
Contributor

Why isn't this just hash.clone()?

Contributor Author

So the reason I can't use hash.clone() is that Rust complains that hash is used by the transaction above: it is moved into the transaction's closure because String does not implement Copy. Because of this I need to create two clones of hash, one for the transaction and one for the tokio spawns (there are two of them).
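The move semantics can be reproduced with plain std threads (a minimal sketch; the real code moves `hash` into a transaction closure and tokio spawns):

```rust
use std::thread;

// `String` is not `Copy`, so the first `move` closure consumes `hash`.
// Every additional closure therefore needs its own clone, taken before
// the move happens; cloning `hash` afterwards would not compile.
fn len_via_two_spawns(hash: String) -> (usize, usize) {
    let hash_for_spawns = hash.clone(); // clone before `hash` is moved
    let first = thread::spawn(move || hash.len()); // `hash` moved here
    let second = thread::spawn(move || hash_for_spawns.len());
    (first.join().unwrap(), second.join().unwrap())
}

fn main() {
    assert_eq!(len_via_two_spawns("abc123".to_string()), (6, 6));
}
```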

@AbrarQuazi force-pushed the update-transaction-read-job branch from c33b2a7 to f126081 on January 16, 2025 at 23:17
@colinschmidt (Contributor) left a comment

Logic seems correct; a few error-message and style nits left, but LGTM.

.map(|m| {
let blob_id = m.blob_id;
let resolved_blob = resolved_blob_map.get(&blob_id).cloned().ok_or_else(|| {
format!("Missing resolved blob for {}", blob_id)
Contributor

Do we want to track the job that had a blob fetch failure? I can see that being useful debug info when something is failing.

Contributor Author

Sounds good, I'll add job_id to the failure message.
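The adjusted message might look like this (signature hypothetical; ids shown as `u32` for brevity):

```rust
// Include the owning job in the failure text so a blob fetch failure
// can be traced back to the job that referenced the blob.
fn missing_blob_msg(blob_id: u32, job_id: u32) -> String {
    format!("Missing resolved blob {} for job {}", blob_id, job_id)
}

fn main() {
    assert_eq!(
        missing_blob_msg(7, 42),
        "Missing resolved blob 7 for job 42"
    );
}
```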

};

let job_id = matching_job.id;
let hash_copy = hash_for_spawns.clone();
Contributor

Again it feels like we didn't need to create this separate hash_for_spawns variable and could have instead continued to clone hash

Contributor Author

[Screenshot: compiler error, 2025-01-21] This is the Rust compiler error I am getting for just trying to clone `hash`.

@colinschmidt (Contributor) left a comment

LGTM

@AbrarQuazi AbrarQuazi merged commit 6f23bbd into master Jan 21, 2025
11 checks passed
@AbrarQuazi AbrarQuazi deleted the update-transaction-read-job branch January 21, 2025 23:09