Should oras.Copy() follow manifests specified in the layers or the blobs field? #401

Open
Wwwsylvia opened this issue Jan 12, 2023 · 5 comments
Labels
bug Something isn't working enhancement New feature or request question Further information is requested
Milestone

Comments

@Wwwsylvia
Member

Wwwsylvia commented Jan 12, 2023

Currently, oras.Copy() follows all the successors of a node to copy all the sub-DAGs.

For manifests specified in the layers field of OCI Image manifests or the blobs field of OCI Artifact manifests, should oras.Copy() treat them as leaf nodes or follow them to copy the sub-DAGs?

@Wwwsylvia Wwwsylvia added the bug Something isn't working label Jan 12, 2023
@shizhMSFT shizhMSFT added this to the future milestone Jan 12, 2023
@shizhMSFT shizhMSFT added the question Further information is requested label Jan 12, 2023
@shizhMSFT shizhMSFT modified the milestones: future, v2.0.0 Jan 12, 2023
@shizhMSFT shizhMSFT removed the question Further information is requested label Jan 12, 2023
@shizhMSFT shizhMSFT modified the milestones: v2.0.0, future Jan 13, 2023
@shizhMSFT shizhMSFT added enhancement New feature or request question Further information is requested labels Jan 13, 2023
@Wwwsylvia Wwwsylvia changed the title oras.Copy() should not follow manifests specified in the layers or the blobs field Should oras.Copy() follow manifests specified in the layers or the blobs field? Jan 13, 2023
@Wwwsylvia
Member Author

Scenario 1: Repository

Suppose there is a DAG like the one below, where manifest B references manifest A as one of its layers, and manifest list C references A* (identical to A) as one of its manifests.
When copying this DAG to a remote repository, how should oras.Copy() handle manifests A and A*?

Undoubtedly, manifest A* should be treated as a non-leaf node and pushed to the repository via the manifest endpoint.
But what about manifest A? If it is treated as a leaf node, should it be pushed to the repository via the blob endpoint?

  • If so, the same content is pushed twice (A via the blob endpoint, A* via the manifest endpoint) and may be stored separately in the blob storage and the manifest storage of the remote repository.

  • If not, manifests A and A* will be pushed just once via the manifest endpoint. However, if manifest A is copied first as a leaf node, blob E won't be copied along with it. Blob E will then never be copied, since manifest A* already exists in the target by the time it is reached and is skipped.

graph TD

A[Manifest A]
AS[Manifest A*]
B[Manifest B]
C[Manifest List C]
D[Manifest List D]
E[Blob E]


A -.-> E
AS --> E
B -- layers --> A
C -- manifests --> AS
D --> B
D --> C
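To make the ordering problem concrete, here is a minimal sketch in Go. It does not use oras-go at all: `edge`, `graph`, `scenarioGraph` and `copyGraph` are hypothetical names for a toy model, and A and A* are modeled as a single node because they have identical content (and therefore the same digest). The copy skips anything that already exists in the target, which is the assumption behind the "skipped for copy" behavior described above.

```go
package main

import "fmt"

// edge is a toy reference from one node to another.
// viaLayers marks references made through a layers (or blobs) field.
type edge struct {
	to        string
	viaLayers bool
}

// graph maps a node name (standing in for a digest) to its outgoing references.
type graph map[string][]edge

// scenarioGraph builds the DAG from the scenario: D -> B, D -> C,
// B -layers-> A, C -manifests-> A, A -> E.
func scenarioGraph() graph {
	return graph{
		"D": {{to: "B"}, {to: "C"}},
		"B": {{to: "A", viaLayers: true}},
		"C": {{to: "A"}},
		"A": {{to: "E"}},
	}
}

// copyGraph copies root and its successors into an initially empty target,
// children first. Nodes that already exist in the target are skipped.
// When layersAsLeaves is true, nodes reached through a layers reference
// are copied without following their own successors.
func copyGraph(g graph, root string, layersAsLeaves bool) map[string]bool {
	target := map[string]bool{}
	var visit func(name string, asLeaf bool)
	visit = func(name string, asLeaf bool) {
		if target[name] {
			return // already in the target: skip, including its sub-DAG
		}
		if !asLeaf {
			for _, e := range g[name] {
				visit(e.to, layersAsLeaves && e.viaLayers)
			}
		}
		target[name] = true
	}
	visit(root, false)
	return target
}

func main() {
	asLeaves := copyGraph(scenarioGraph(), "D", true)
	followed := copyGraph(scenarioGraph(), "D", false)
	fmt.Println("layers as leaves, E copied:", asLeaves["E"])
	fmt.Println("layers followed,  E copied:", followed["E"])
}
```

In this toy model B is visited before C, so A lands in the target as a leaf and the later visit through C is skipped, leaving E out. Following the layers reference copies all five nodes. The real traversal in oras.Copy() may be concurrent, so the visiting order there is not guaranteed to match this sketch.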

@Wwwsylvia
Member Author

Scenario 2: OCI Layout

Suppose there is a DAG like the one below, where manifest A is referenced by manifest B as a layer and by manifest list C as a manifest.
When copying this DAG to an OCI layout, oras.Copy() will copy manifest A only once, whether or not it treats manifest A as a leaf node (to be copied along with manifest B), since an OCI layout stores manifests and blobs in the same storage.
But if manifest A is copied as a leaf node along with manifest B, and this happens before manifest list C is copied, blob E will never be copied: manifest A already exists in the layout by the time it is reached through C, so it is skipped.

graph TD

A[Manifest A]
B[Manifest B]
C[Manifest List C]
D[Manifest List D]
E[Blob E]


A --> E
B -- layers --> A
C -- manifests --> A
D --> B
D --> C

@Wwwsylvia
Member Author

Scenario 3: Repository Double CASs

Suppose the DAG below is being copied to a remote repository. Should manifest A be pushed via the manifest endpoint or via the blob endpoint? Or should it be pushed twice, via both endpoints?

graph TD

A[Manifest A]
B[Manifest B]

B -- layers --> A
B -- subject --> A

@Wwwsylvia
Member Author

Interestingly, the docker buildx build command generates build caches like this, putting layers in the manifests field of an OCI image index.
When copying such a structure to a remote repository, should oras.Copy() push these layers (specified as manifests) via the manifest endpoint or the blob endpoint? 🤔

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1",
      "size": 32,
      "annotations": {
        "buildkit/createdat": "2023-01-13T07:49:09.921545067Z",
        "containerd.io/uncompressed": "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:d74d7d17ce90514c5eed8068791ab9b1d58f355a367c6a87bd3e0e1dc8113500",
      "size": 105,
      "annotations": {
        "buildkit/createdat": "2023-01-13T07:49:09.864832789Z",
        "containerd.io/uncompressed": "sha256:601bb128dc20e9b8a296510b1c840d58dfd7d596ae1396d52e886753423c052c"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:df9b9388f04ad6279a7410b85cedfdcb2208c0a003da7ab5613af71079148139",
      "size": 2814559,
      "annotations": {
        "buildkit/createdat": "2023-01-13T07:48:28.219213701Z",
        "containerd.io/uncompressed": "sha256:4fc242d58285699eca05db3cc7c7122a2b8e014d9481f323bd9277baacfa0628"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:eb630b592770ba0b3982595e566c1027966cf6b9733c5fc1bf0794bf6bc2c9cd",
      "size": 3578366,
      "annotations": {
        "buildkit/createdat": "2023-01-13T07:49:09.693226029Z",
        "containerd.io/uncompressed": "sha256:3cb741a610a6253327467f4bb4e3de9397c36846b2407dc56992c04475ced968"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:ed0f0d4a18721d4dc5d5d8ffb7eaeb0df00ab5d1001bfa594419a1b8dd5ffc09",
      "size": 2581904,
      "annotations": {
        "buildkit/createdat": "2023-01-13T07:48:34.097896893Z",
        "containerd.io/uncompressed": "sha256:6d69e1b372ea8a2e13783b213f4cf108be422a05740625a209a054b48c9a76cd"
      }
    },
    {
      "mediaType": "application/vnd.buildkit.cacheconfig.v0",
      "digest": "sha256:f06f3ad8ce85bcc973c15a11f419e7601e74db8db0e7af8d05587d24d77ffc83",
      "size": 2407
    }
  ]
}
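One possible policy for a case like this is to choose the endpoint from the descriptor's media type rather than from the field that references it. The sketch below only illustrates that idea and is not what oras.Copy() is confirmed to do. `manifestMediaTypes` and `endpointFor` are hypothetical names; the media type strings are the well-known Docker and OCI manifest types.

```go
package main

import "fmt"

// manifestMediaTypes lists media types that this sketch treats as manifests.
// Anything else is assumed to be a plain blob.
var manifestMediaTypes = map[string]bool{
	"application/vnd.docker.distribution.manifest.v2+json":      true,
	"application/vnd.docker.distribution.manifest.list.v2+json": true,
	"application/vnd.oci.image.manifest.v1+json":                true,
	"application/vnd.oci.image.index.v1+json":                   true,
	"application/vnd.oci.artifact.manifest.v1+json":             true,
}

// endpointFor picks the registry endpoint family for a descriptor
// based only on its media type, ignoring which field referenced it.
func endpointFor(mediaType string) string {
	if manifestMediaTypes[mediaType] {
		return "manifests"
	}
	return "blobs"
}

func main() {
	// Media types taken from the buildx cache index above, plus the index itself.
	for _, mt := range []string{
		"application/vnd.oci.image.index.v1+json",
		"application/vnd.oci.image.layer.v1.tar+gzip",
		"application/vnd.buildkit.cacheconfig.v0",
	} {
		fmt.Printf("%s -> %s\n", mt, endpointFor(mt))
	}
}
```

Under this policy the layer and cache config entries in the buildx index would go to the blob endpoint even though they appear in the manifests field. It does not answer Scenario 3, where the same manifest is referenced both as a layer and as a subject.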

@Wwwsylvia
Member Author

We may need to introduce a new method to return leaf successors and non-leaf successors separately, as a complement to content.Successors().

oras-go/content/graph.go

Lines 47 to 106 in 76382aa

// Successors returns the nodes directly pointed by the current node.
// In other words, returns the "children" of the current descriptor.
func Successors(ctx context.Context, fetcher Fetcher, node ocispec.Descriptor) ([]ocispec.Descriptor, error) {
	switch node.MediaType {
	case docker.MediaTypeManifest:
		content, err := FetchAll(ctx, fetcher, node)
		if err != nil {
			return nil, err
		}
		// OCI manifest schema can be used to marshal docker manifest
		var manifest ocispec.Manifest
		if err := json.Unmarshal(content, &manifest); err != nil {
			return nil, err
		}
		return append([]ocispec.Descriptor{manifest.Config}, manifest.Layers...), nil
	case ocispec.MediaTypeImageManifest:
		content, err := FetchAll(ctx, fetcher, node)
		if err != nil {
			return nil, err
		}
		var manifest ocispec.Manifest
		if err := json.Unmarshal(content, &manifest); err != nil {
			return nil, err
		}
		var nodes []ocispec.Descriptor
		if manifest.Subject != nil {
			nodes = append(nodes, *manifest.Subject)
		}
		nodes = append(nodes, manifest.Config)
		return append(nodes, manifest.Layers...), nil
	case docker.MediaTypeManifestList, ocispec.MediaTypeImageIndex:
		content, err := FetchAll(ctx, fetcher, node)
		if err != nil {
			return nil, err
		}
		// docker manifest list and oci index are equivalent for successors.
		var index ocispec.Index
		if err := json.Unmarshal(content, &index); err != nil {
			return nil, err
		}
		return index.Manifests, nil
	case ocispec.MediaTypeArtifactManifest:
		content, err := FetchAll(ctx, fetcher, node)
		if err != nil {
			return nil, err
		}
		var manifest ocispec.Artifact
		if err := json.Unmarshal(content, &manifest); err != nil {
			return nil, err
		}
		var nodes []ocispec.Descriptor
		if manifest.Subject != nil {
			nodes = append(nodes, *manifest.Subject)
		}
		return append(nodes, manifest.Blobs...), nil
	}
	return nil, nil
}
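As a rough illustration of what a complementary method could look like, the sketch below splits the successors of an image manifest into two groups. Everything here is hypothetical: `splitSuccessors`, `descriptor` and `manifest` are stand-ins defined locally instead of the ocispec types, and the classification (subject as non-leaf, config and layers as leaf) is just one assumption about where the line could be drawn.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor is a trimmed local stand-in for an OCI descriptor.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

// manifest holds only the image manifest fields this sketch needs.
type manifest struct {
	Subject *descriptor  `json:"subject,omitempty"`
	Config  descriptor   `json:"config"`
	Layers  []descriptor `json:"layers"`
}

// splitSuccessors parses an image manifest and returns its successors in two
// groups: non-leaf (subject) and leaf (config and layers).
// Assumption: anything referenced from layers is a leaf, whatever its media type.
func splitSuccessors(content []byte) ([]descriptor, []descriptor, error) {
	var m manifest
	if err := json.Unmarshal(content, &m); err != nil {
		return nil, nil, err
	}
	var nonLeaf, leaf []descriptor
	if m.Subject != nil {
		nonLeaf = append(nonLeaf, *m.Subject)
	}
	leaf = append(leaf, m.Config)
	leaf = append(leaf, m.Layers...)
	return nonLeaf, leaf, nil
}

func main() {
	// Manifest B from Scenario 3: subject and a layer both point at manifest A.
	// The digests are placeholders, not real content addresses.
	b := []byte(`{
		"subject": {"mediaType": "application/vnd.oci.image.manifest.v1+json", "digest": "sha256:aaa", "size": 1},
		"config":  {"mediaType": "application/vnd.oci.image.config.v1+json", "digest": "sha256:ccc", "size": 2},
		"layers": [{"mediaType": "application/vnd.oci.image.manifest.v1+json", "digest": "sha256:aaa", "size": 1}]
	}`)
	nonLeaf, leaf, err := splitSuccessors(b)
	if err != nil {
		panic(err)
	}
	fmt.Println("non-leaf successors:", len(nonLeaf))
	fmt.Println("leaf successors:", len(leaf))
}
```

With the Scenario 3 input, the same digest shows up in both groups, so a caller would still need a rule for a node that is both a leaf and a non-leaf successor.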
