IPIP-425: Signaling Features on HTTP Gateways #425

Draft
wants to merge 2 commits into
base: main
40 changes: 40 additions & 0 deletions src/http-gateways/path-gateway.md
@@ -542,6 +542,46 @@ Optional, present in certain response types:
non-executable binary response types are not used in `<script>` and `<style>`
HTML tags.

### `Ipfs-Gateway-Features` (response header)

Optional. This header SHOULD only be returned in response to an HTTP `OPTIONS` request.
Member

(just tuning into this PR now that I care about signalling, sorry!)

Whatever this header is, I think it should also be returned when a server doesn't implement a feature requested, either via Accept header or other means and in that case the server should return a 415. That way I can parse the list of Accept content types, and be strict about the parameters, and if I don't find one that I fully support, return a 415 with this signal that says what I do support.

That way, a client can optimistically make a request, but be prepared to downgrade if it doesn't get what it wants. Rather than needing to OPTIONS first, thereby saving a round-trip.


The value is a list of key-value pairs, as specified in Section 5.6.1 (Lists) of :cite[rfc9110].

Each feature is indicated by a key and optional value. When more than one value is supported, `|` is used as a separator.
For example:

```
Ipfs-Gateway-Features: foo, bar=1, buzz=a|b|c
```
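
The format above can be parsed in a few lines. The following is a minimal sketch, not a reference implementation: the function name `parse_gateway_features` is hypothetical, merging repeated keys is an assumption of this sketch, and quoted strings containing commas are not handled.

```python
def parse_gateway_features(value: str) -> dict[str, list[str]]:
    """Parse an Ipfs-Gateway-Features header value into {key: [values]}.

    A bare key (no "=") maps to an empty list.
    """
    features: dict[str, list[str]] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # empty list elements are ignored
        key, sep, raw = item.partition("=")
        key = key.strip()
        if not key:
            continue  # skip malformed items such as "=x"
        values = [v.strip() for v in raw.split("|") if v.strip()] if sep else []
        # Assumption of this sketch: repeated keys are merged, not overwritten.
        merged = features.setdefault(key, [])
        for v in values:
            if v not in merged:
                merged.append(v)
    return features


print(parse_gateway_features("foo, bar=1, buzz=a|b|c"))
# {'foo': [], 'bar': ['1'], 'buzz': ['a', 'b', 'c']}
```

Representing a bare key as an empty list keeps "feature present without a value" distinct from "feature absent", which matters because an absent key means the support status is unknown.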

A Gateway SHOULD use this header to communicate support for specific Gateway features, enabling clients to make better decisions on how to retrieve data.

A Gateway MAY define and return its own features.

A Client MUST send an HTTP `OPTIONS` request to inspect this header before performing any more expensive feature detection.
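
A client-side probe could look like the sketch below. It uses only the Python standard library; the function name `fetch_gateway_features`, the injectable `opener` parameter, and the timeout default are assumptions made for illustration, and the text above does not say which URL path the `OPTIONS` request should target.

```python
import urllib.error
import urllib.request


def fetch_gateway_features(url, opener=urllib.request.urlopen, timeout=5.0):
    """Send an HTTP OPTIONS request and return the raw Ipfs-Gateway-Features
    header value, or None when the header is absent or the request fails.

    None means "support status unknown", not "unsupported".
    """
    request = urllib.request.Request(url, method="OPTIONS")
    try:
        with opener(request, timeout=timeout) as response:
            return response.headers.get("Ipfs-Gateway-Features")
    except (urllib.error.URLError, OSError):
        # Network errors and HTTP error statuses are treated as "unknown".
        return None
```

A caller would pass a gateway URL of its choice (for example a hypothetical `https://gateway.example/ipfs/`) and fall back to regular feature probing when the result is `None`.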

#### Canonical `Ipfs-Gateway-Features` values

- `trustless-gateway` for :cite[trustless-gateway]
- `path-proof` indicates support for returning parent blocks up to the terminus element
Comment on lines +566 to +567
Member

Suggested change
- `trustless-gateway` for :cite[trustless-gateway]
- `path-proof` indicates support for returning parent blocks up to the terminus element
- `trustless-gateway` for :cite[trustless-gateway]
- `formats=car|raw` indicates whether CAR and/or raw block responses are supported
- `path-proof` indicates support for returning parent blocks up to the terminus element

- `car-version=1|2` indicates CAR support
Member

Suggested change
- `car-version=1|2` indicates CAR support
- `car-version=1` indicates CAR support

we should strictly do v1 for now, v2 doesn't make sense for this (yet), but it might be worth signalling this so we can also add 3 etc. when we get to it

- `dag-scope=block|entity|all` from :cite[ipip-0402]
- `entity-bytes` from :cite[ipip-0402], implies support for `dag-scope=entity` as well
Comment on lines +566 to +570
Member Author

@hannahhoward @alanshaw @olizilla I know you want a way for signaling support for partial cars and range requests between Lassie and .storage to avoid expensive feature-sniffing.

With this IPIP, signaling support for partial CARs WITHOUT range requests would look like this:

Ipfs-Gateway-Features: dag-scope=block|entity|all

and if you add support for entity-bytes at some point:

Ipfs-Gateway-Features: dag-scope=block|entity|all, entity-bytes

- `car-block-order=dfs` from :cite[ipip-0412]
Member

Suggested change
- `car-block-order=dfs` from :cite[ipip-0412]
- `order=dfs` from :cite[ipip-0412]

- `car-block-dupes=y|n` from :cite[ipip-0412]
Member

Suggested change
- `car-block-dupes=y|n` from :cite[ipip-0412]
- `dups=y|n` from :cite[ipip-0412]

- `path-gateway` for deserialized responses defined by :cite[path-gateway]
- `subdomain-gateway=example.com` for :cite[subdomain-gateway] support based on `Host` header
- `dnslink-gateway` for :cite[dnslink-gateway] support based on `Host` header
- `ipns` indicating :cite[ipns-record] support on `/ipns/` content paths
- `dnslink` indicating [DNSLink](https://dnslink.dev) support on `/ipns/` content paths
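
One detail from the list above that is easy to miss is that `entity-bytes` implies `dag-scope=entity`. A hedged sketch of a lookup that honors this rule follows; it assumes the header has already been parsed into a `{key: [values]}` dictionary, the helper name `supports_dag_scope` is hypothetical, and the key names are the ones proposed in this draft, which may still change.

```python
def supports_dag_scope(features, scope):
    """Return True when the parsed features advertise the given dag-scope.

    False means "not advertised" (support unknown), not "unsupported".
    """
    if scope in features.get("dag-scope", []):
        return True
    # `entity-bytes` implies support for `dag-scope=entity`.
    return scope == "entity" and "entity-bytes" in features


advertised = {"dag-scope": ["block", "all"], "entity-bytes": []}
print(supports_dag_scope(advertised, "entity"))  # True
```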

Member Author

Another thing that could be useful is max block size:

Suggested change
- `max-block-size` to indicate what is the biggest block that can be retrieved by this gateway (default: 1MiB? 2MiB? -- whatever we currently have in boxo/gateway)

Member Author

@lidel lidel Aug 18, 2023

Another thing we could signal: if gateway is recursive (fetching content if not available locally) or not (cc @aschmahmann @Jorropo)

Member

good idea but hold off until we have a real use-case for it reckon - unless you can conjure one now?

<!-- TODO do we want these too?
- `multibase` list of prefixes, indicates which multibase encoding are supported in CIDs
- `multihash` indicates which hash functions are supported in CIDs
- `multicodec` indicates which codecs are supported in CIDs
-->
Comment on lines +579 to +583
Member Author

@lidel lidel Jul 6, 2023

Thoughts on adding these or even more?

The need for list of supported IPLD codecs was raised by ipfs-chromium and Capyloon in #402 (comment), having the other two would not hurt too.

My only constraint would be to use numeric (decimal or hex) codes, to avoid problems if we ever do this again.

Member Author

cc @aschmahmann as you noted in 402:

Not specifically related to the trustless gateway, but if we're adding OPTIONS support we might want to be able to discover things like supported hash functions and IPLD codecs.

Would below work? is there a better way of representing this?

multihash=0x12|0x13|etc # identity + goodset from https://github.com/ipfs/boxo/blob/cfad09d7156efa2f09822d620cacb2423d884067/verifcid/validate.go#L17
multicodec=0x51|0x55|0x70|0x71|0x72|0x0129|0x0200 # based on boxo/gateway 

Member

too much information, let's start simple for now


### `Server-Timing` (response header)

Optional. Implementations MAY use this header to communicate one or more
113 changes: 113 additions & 0 deletions src/ipips/ipip-0425.md
@@ -0,0 +1,113 @@
---
title: "IPIP-0425: Signaling Features on HTTP Gateways"
date: 2023-07-06
ipip: proposal
editors:
  - name: Marcin Rataj
    github: lidel
    url: https://lidel.org/
relatedIssues:
  - https://github.com/ipfs/specs/pull/402#pullrequestreview-1396116569
  - https://github.com/ipfs/specs/pull/412#pullrequestreview-1427137365
order: 425
tags: ['ipips']
---

## Summary

Add the ability to query an HTTP Gateway for an explicit list of supported features.

## Motivation

A Gateway always ships with an opinionated set of supported hash functions and
IPLD codecs, and the differences between implementations will grow over time.

For example, some legacy gateways may not support newly added features like
`dag-scope` and `entity-bytes` from :cite[ipip-0402] or the ability to get some
block ordering guarantees introduced in :cite[ipip-0412]. Future IPIPs may add
more features and response formats.

We need a lightweight mechanism for clients to detect which gateways support partial CARs.

## Detailed design

This IPIP introduces a set of HTTP headers returned in response to an `OPTIONS` request:

The `Ipfs-Gateway-Features` header is used for signalling support for specific Gateway features to the client.

The absence of the header, or of a key-value pair within it, means the support status is unknown.

The initial list of key-value pairs is documented in the `Ipfs-Gateway-Features` section of :cite[path-gateway].

## Design rationale

There is good prior art for this in web browsers, where the HTTP `OPTIONS` method
is used in a [CORS Preflight request](https://developer.mozilla.org/en-US/docs/Glossary/Preflight_request)
that checks whether the CORS protocol is understood and whether the server is
aware of specific methods and headers.

The `OPTIONS` request is often sent by web browsers anyway, so we would not be
introducing much overhead.

Contributor

I assume this is often a heavily cacheable response, so we'd be hitting the browser cache.

### User benefit

Reduced cost for both client and gateway. Sending an `OPTIONS` request is
very inexpensive, especially when compared with the status quo, where a
client has to send at least one request to probe each specific feature, such as
support for the Blake3 hash function, `dag-scope`, or `entity-bytes`.

This translates to decreased latency and ability to choose the best retrieval
strategy faster.

### Compatibility

This IPIP is fully backward-compatible with browsers and the existing IPFS
ecosystem. Gateways already return CORS headers with `OPTIONS` responses; we
will simply return additional headers with the same responses.

:::issue

For JavaScript running on web pages to be able to read the `Ipfs-Gateway-Features`
header, it MUST be safelisted via `Access-Control-Expose-Headers`.

:::
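
On the gateway side, the safelisting requirement could be met roughly as sketched below. This is an illustration under stated assumptions: `build_options_headers` is a hypothetical helper, the headers are modeled as a plain dictionary rather than any particular server framework's API, and other CORS headers a real gateway returns are omitted.

```python
def build_options_headers(features, existing_expose=""):
    """Return response headers for an OPTIONS reply that advertise features
    and safelist the header so that page JavaScript can read it."""
    exposed = [h.strip() for h in existing_expose.split(",") if h.strip()]
    # Header names are case-insensitive, so compare in lower case.
    if not any(h.lower() == "ipfs-gateway-features" for h in exposed):
        exposed.append("Ipfs-Gateway-Features")
    return {
        "Ipfs-Gateway-Features": features,
        "Access-Control-Expose-Headers": ", ".join(exposed),
    }


print(build_options_headers("trustless-gateway", "Content-Range"))
# {'Ipfs-Gateway-Features': 'trustless-gateway', 'Access-Control-Expose-Headers': 'Content-Range, Ipfs-Gateway-Features'}
```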

### Security

This IPIP does not introduce any new security concerns. Probing a gateway for
supported features and hash functions is already possible via regular `GET`
requests.

### Alternatives

- Exposing the list of supported features via `GET /.well-known/ipfs/gateway/features-TBD` would also work, but:

I agree that using OPTIONS here makes more sense for this use case than using .well-known. My argument is that the HTTP Gateway API is an application-level protocol, and these are features for this application. .well-known, on the other hand, is for metadata about an origin, not a specific application.

This is also why libp2p/specs#508 uses .well-known for metadata on where application protocols are mounted. This is metadata about the origin.

- `/.well-known` is an additional top-level namespace that needs to be
explicitly exposed
- this introduces surface for path-related deployment bugs, where Nginx is
only exposing the `/ipfs` namespace; in such a case the signaling endpoint would
not be exposed to the public internet
- this is a real problem, as all legacy deployments expose `/ipfs` and
`/ipns`, and even when Kubo or another implementation adds `/.well-known` it
will not be exposed to the internet, breaking the feature detection
scheme and making the gateway look like a legacy one
- requires an additional HTTP GET, while in some cases HTTP OPTIONS is already
sent (e.g., [browser's preflight
requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS#preflighted_requests))
and would not introduce overhead
Contributor

(suggestion not working for me for some reason)

"and would not introduce overhead"

- if we have different services running on the same Origin, with
`.well-known` we need to make multiple GET requests for different
`.well-known/..` paths, versus sending a single HTTP OPTIONS and getting all
headers related to a pre-existing path
- HTTP headers can be added as the request passes via reverse proxies, CDNs
Contributor

request

and load balancers, allowing different services to announce support for
different features. This "just works", while mutating the response body of a
`.well-known` manifest introduces the need for additional middleware.
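
The last point, that intermediaries can add features to a header without touching a response body, can be illustrated with a small sketch. The helper name `append_features` is hypothetical, and the sketch assumes that a proxy only ever appends list items and never rewrites the ones set upstream.

```python
def append_features(upstream_value, extra_items):
    """Return an Ipfs-Gateway-Features value that keeps the upstream items
    and appends the items announced by an intermediary, without duplicates."""
    items = [i.strip() for i in (upstream_value or "").split(",") if i.strip()]
    for item in extra_items:
        if item not in items:
            items.append(item)
    return ", ".join(items)


print(append_features("trustless-gateway, dag-scope=block|entity|all", ["entity-bytes"]))
# trustless-gateway, dag-scope=block|entity|all, entity-bytes
```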

## Test fixtures

TODO

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).