Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upload New AppData to IPFS #2583

Closed
wants to merge 2 commits into from

Conversation

bh2smith
Copy link
Contributor

@bh2smith bh2smith commented Mar 31, 2024

This PR closes #2302 by implementing a post_app_data method on the IpfsAppData structure (which internally calls a newly implemented Ipfs.add. This is just a raw http-request POST request to this Pinata Endpoint.

This is done by introducing a structure with correctly formatted payload that is accepted by the API endpoint (see unit tests for the specific format). Constructor method are in place to manufacture the specific format and all possible routes are tested (however only the expected payload format is included in a test that isn't ignored).

Notes on Decisions Made

  1. At first, I considered using the ipfs-api crate for this, but the use case was a bit too "niche" with the CID being specifically built on the appDataHash.
  2. I also looked into @vkgnosis's previous work on the matter here, but it only pins the CID.
  3. Still not sure exactly which endpoint was supposed to be used (e.g. PinFileToIPFS, PinJSON, but chose PinByCID). The code introduced here results in a successful call to the endpoint, but all the posts are still in a "Pending State" of "searching".... as if they don't exist.

⚠️ This endpoint is tightly bound to https://www.pinata.cloud/ who seem to have these "proprietary" endpoints. I looked into using Infura, but apparently their IPFS endpoint is "temporarily down".

cc @fleupold (as the creator or the related issue).

@bh2smith bh2smith requested a review from a team as a code owner March 31, 2024 18:02
Copy link

github-actions bot commented Mar 31, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@bh2smith
Copy link
Contributor Author

I have read the CLA Document and I hereby sign the CLA

@@ -18,7 +18,7 @@ pub struct ValidatedAppData {
pub protocol: ProtocolAppData,
}

#[derive(Clone, Debug, Default, Deserialize, Eq, PartialEq)]
#[derive(Clone, Debug, Default, Deserialize, Eq, PartialEq, Serialize)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were added across a bunch of structures in order to "Serialize" the AppData structure as JSON for post.

@@ -60,7 +60,13 @@ impl Registry {
.insert_full_app_data(&validated.hash, &validated.document)
.await
{
Ok(()) => Ok((Registered::New, validated.hash)),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We intercept the successful DB insertion to post to IPFS.

Ok(()) => {
// Post New AppData to IPFS after successful insert.
if let Some(ipfs) = &self.ipfs {
let _ = ipfs.post_app_data(validated.clone()).await;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably make use of the result here, but didn't have any thoughts on how you would like to do it. Maybe we can move the logging out of the inner methods to here...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case this operation is not going to be executed atomically with the DB insert, how about running the IPFS request asynchronously(probably, with a retry ability)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion sounds a bit like making use of this existing project https://github.com/cowprotocol/ipfs-block-put which can be running as a separate service whose only responsibility receive upload requests and handle them. Then here, we would just send out our requests to upload and forget about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that using the block put service optimistically (ie in a fire/forget manner) make sense for implementing this.

It looks like ipfs-block-put doesn't currently support retry nor prometheus metrics, which would be needed though to ensure we can have proper alerting in case it breaks.

pub id: String,
pub ipfs_hash: String,
pub status: String,
pub name: Option<String>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is entirely unused by this code change and could be left out. Could use the hex appdata hash as the name.

Comment on lines -56 to +74
if let Some(query) = &self.query {
url.set_query(Some(query.as_str()));
if let Some(jwt) = &self.auth_token {
url.set_query(Some(&format!("pinataGatewayToken={}", jwt)));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This subtle change allows the auth token to be used for the private gateway both for fetching and for uploading. The way this was formatted before was bound to the "fetch" endpoint.

auth_token: Option<String>,
}

#[derive(Serialize, Deserialize, Debug, PartialEq)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the request need to be deserialized on our end?


#[derive(Serialize, Deserialize, Debug, PartialEq)]
#[serde(rename_all = "camelCase")]
pub struct PinByHashRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public visibility not need I believe

pub pinata_metadata: String,
}

#[derive(Debug, Serialize, Deserialize, PartialEq)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the response need to be serializable?

}

impl Ipfs {
pub fn new(client: ClientBuilder, base: Url, query: Option<String>) -> Self {
pub fn new(client: ClientBuilder, base: Url, auth_token: Option<String>) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is coupling the IPFS component too tightly to Pinata imo. Before it would work with any gateway, now it only works with Pinata.

Ok(()) => {
// Post New AppData to IPFS after successful insert.
if let Some(ipfs) = &self.ipfs {
let _ = ipfs.post_app_data(validated.clone()).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that using the block put service optimistically (ie in a fire/forget manner) make sense for implementing this.

It looks like ipfs-block-put doesn't currently support retry nor prometheus metrics, which would be needed though to ensure we can have proper alerting in case it breaks.

Copy link
Contributor

@MartinquaXD MartinquaXD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly concerned with ensuring that what we pin on IPFS actually ends up with the same hash we have in our DB.

pub async fn add(&self, data: PinByHashRequest) -> Result<PinResponse> {
let url = format!("{}pinning/pinByHash", self.base);
// Prevent upload attempt when `auth_token is None`.
let jwt = if let Some(jwt) = &self.auth_token {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should be equivalent to sth like:

let jwt = self.auth_token.context("Can't post without valid JSON web token")?;

Ok(response)
} else {
Err(anyhow!(
"Error posting AppData to IPFS: {:?}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also be nice to get the status code in the error.

if resp.status().is_success() {
let data = resp.text().await?;
let response: PinResponse = serde_json::from_str(&data)?;
tracing::debug!("AppData pinned to IPFS with CID: {}", response.ipfs_hash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We usually try to avoid dynamic log strings to make searching more consistent.

Suggested change
tracing::debug!("AppData pinned to IPFS with CID: {}", response.ipfs_hash);
tracing::debug!(cid = response.ipfs_hash, "AppData pinned to IPFS");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I had been wondering about what that was all about!

Comment on lines +156 to +157
// Format of pin_metadata this MUST BE as follows:
// {"keyvalues":{"appData":"{\"appCode\":\"CoW Swap\"}"}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what actually needs to be stored is just the value of appData.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried... they expect a key. Then we would have to make every inner json document an escaped string.

@@ -118,6 +119,26 @@ impl IpfsAppData {
metric.with_label_values(&[outcome(&result), "node"]).inc();
Ok(result)
}

pub async fn post_app_data(&self, data: ValidatedAppData) -> Result<PinResponse> {
tracing::debug!("posting appData to IPFS {:?}", data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tracing::debug!("posting appData to IPFS {:?}", data);
tracing::debug!(?data, "posting to IPFS");

Comment on lines +132 to +134
let parsed_json: Value = serde_json::from_str(&value.document).unwrap();
// build escaped json string.
let escaped_json_string = serde_json::to_string(&parsed_json.to_string()).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since already have a ValidatedAppData we should no longer have to parse the content to verify it's valid JSON.
Also we should use the value.document as is instead of deserializing and re-serializing since this might change its representation which can result in a different CID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem with document is that it's not escaped properly for ipfs api.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, strange that they require a specific format for the content. Btw when looking into pinByHash they say it only pins document that have already been uploaded to IPFS. It looks like what we'd actually want to use is pinFileToIpfs to actually make pinata upload the file.
Or alternatively as discussed in some other comment using https://github.com/cowprotocol/ipfs-block-put should also work with the raw strings.

Copy link
Contributor Author

@bh2smith bh2smith Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya maybe I misunderstood what uploading a file to ipfs is. I guess pinning is not the same thing. Or needs to be done after. My thought was that pin by hash would upload the document and link it to our specific cid.

Anyway, I can't touch this again till the weekend and it seems to be a whole lot more involved than I was expecting.

Copy link

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.

@github-actions github-actions bot added the stale label Apr 10, 2024
@mfw78 mfw78 removed the stale label Apr 12, 2024
Copy link

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.

@github-actions github-actions bot added the Stale label Apr 20, 2024
@github-actions github-actions bot closed this Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Upload AppData pre-images to IPFS
5 participants