Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implement Delete for OCI Layout #582

Closed

Conversation

wangxiaoxuan273
Copy link
Contributor

@wangxiaoxuan273 wangxiaoxuan273 commented Sep 1, 2023

Attempt to resolve #454

Based on the implementation of #470, but with RWMutex.

DeletableStore is basically copied from oci.Store with an added RWMutex.
DeletableMemory is basically copied from graph.Memory with an added RWMutex.

@wangxiaoxuan273 wangxiaoxuan273 marked this pull request as draft September 1, 2023 08:19
@wangxiaoxuan273 wangxiaoxuan273 changed the title draft PR to implement OCI delete [WIP] Implement Delete for OCI Layout Sep 1, 2023
Comment on lines 139 to 121
// There is no data consistency issue as long as deletion is not implemented
// for the underlying storage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the note to RemoveFromIndex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the lock to the DeletableStore level.

@codecov-commenter
Copy link

codecov-commenter commented Sep 4, 2023

Codecov Report

Merging #582 (6a4ade0) into main (cb8c8bc) will decrease coverage by 2.02%.
The diff coverage is 53.46%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@            Coverage Diff             @@
##             main     #582      +/-   ##
==========================================
- Coverage   74.64%   72.63%   -2.02%     
==========================================
  Files          59       62       +3     
  Lines        5266     5635     +369     
==========================================
+ Hits         3931     4093     +162     
- Misses        983     1162     +179     
- Partials      352      380      +28     
Files Changed Coverage Δ
internal/container/set/set.go 77.77% <0.00%> (-22.23%) ⬇️
internal/resolver/memory.go 90.00% <0.00%> (-10.00%) ⬇️
content/oci/deletableoci.go 42.45% <42.45%> (ø)
content/oci/storage.go 61.25% <57.14%> (-0.40%) ⬇️
internal/graph/deletablememory.go 85.00% <85.00%> (ø)

... and 1 file with indirect coverage changes

@@ -31,6 +31,7 @@ import (
// Memory is a memory based PredecessorFinder.
type Memory struct {
predecessors sync.Map // map[descriptor.Descriptor]map[descriptor.Descriptor]ocispec.Descriptor
successors sync.Map // map[descriptor.Descriptor]map[descriptor.Descriptor]ocispec.Descriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the successor list of a node is immutable and we don't need random access on the successors, the mapping can be from a descriptor to a successor list (map[descriptor.Descriptor][]ocispec.Descriptor).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed both successors and predecessors to the regular map, and added locks in index and Remove.

key := descriptor.FromOCI(node)
m.predecessors.LoadOrStore(key, &sync.Map{})
m.successors.LoadOrStore(key, &sync.Map{})
m.indexed.LoadOrStore(key, &sync.Map{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should mark a node as indexed after it gets indexed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the node's presence in indexed means it has been indexed? Do we need an extra mark?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation marks the node as indexed before it gets indexed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I did not notice that I changed the behavior of indexed from the original behavior. Changed it back now. Currently indexed is updated after the node gets indexed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable indexed has been removed now.

}
}

func (m *Memory) indexIntoMemory(ctx context.Context, node ocispec.Descriptor) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it called indexIntoMemory? Everything is in memory in this file. 🤔

Copy link
Contributor Author

@wangxiaoxuan273 wangxiaoxuan273 Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed it as createEntriesInMaps.


func (m *Memory) removeFromMemory(ctx context.Context, node ocispec.Descriptor) {
key := descriptor.FromOCI(node)
m.predecessors.Delete(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not want to delete the key from predecessors when a node is deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? Do we have a scenario of still needing it after the node is deleted? The information is stored in the node's predecessors' successors anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get the predecessors of a deleted node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the entry back. Currently we don't delete the entry in predecessors when doing Delete.


// loadIndexFile reads index.json from the file system.
// Create index.json if it does not exist.
func (ds *DeletableStore) loadIndexFile(ctx context.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this part was not locked (I can't recall why not, probably a bug😟). Should we lock it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it was because loadIndexFile is only called on initialization. We need a comment for this then. (I wrote this code bug I already forget it😅)

Copy link
Contributor Author

@wangxiaoxuan273 wangxiaoxuan273 Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to. This function is only used in New() and I guess it's called only once?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but people (like me) might not realize that😂

Comment on lines 120 to 121
// RemoveFromIndex removes the node from its predecessors and successors.
func (m *Memory) RemoveFromIndex(ctx context.Context, node ocispec.Descriptor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we just call it Remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed as Remove.

@@ -116,19 +117,54 @@ func (m *Memory) Predecessors(_ context.Context, node ocispec.Descriptor) ([]oci
return res, nil
}

// RemoveFromIndex removes the node from its predecessors and successors.
func (m *Memory) RemoveFromIndex(ctx context.Context, node ocispec.Descriptor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a race condition where the same node is indexed and removed at the same time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, what if another goroutine calls Index() when the current goroutine just about to hit the line 132?

)

// 条件:
// 1. 只要node被index(作为函数的第二个输入),node就在predecessors,successors,indexed中有entry

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please translate the comments to English? =)

Copy link
Contributor Author

@wangxiaoxuan273 wangxiaoxuan273 Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it or put it formally in English in later versions. Currently this is a draft PR, and I wrote this comment to remind myself the designs, and the designs may change anytime.

type MemoryWithDelete struct {
indexed sync.Map // map[descriptor.Descriptor]any, this variable is only used by IndexAll
predecessors map[descriptor.Descriptor]map[descriptor.Descriptor]ocispec.Descriptor
successors map[descriptor.Descriptor]map[descriptor.Descriptor]ocispec.Descriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be of type map[descriptor.Descriptor][]ocispec.Descriptor?
See #582 (comment)


// MemoryWithDelete is a MemoryWithDelete based PredecessorFinder.
type MemoryWithDelete struct {
indexed sync.Map // map[descriptor.Descriptor]any, this variable is only used by IndexAll
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this can be moved into IndexAll.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to avoid making changes to IndexAll, until I'm sure it's safe to do it. I still have questions about the behavior of IndexAll.

Comment on lines 114 to 121
_, exists := m.predecessors[key]
if !exists {
return nil, nil
}
var res []ocispec.Descriptor
for _, v := range m.predecessors[key] {
res = append(res, v)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
_, exists := m.predecessors[key]
if !exists {
return nil, nil
}
var res []ocispec.Descriptor
for _, v := range m.predecessors[key] {
res = append(res, v)
}
predecessors, exists := m.predecessors[key]
if !exists {
return nil, nil
}
var res []ocispec.Descriptor
for _, v := range predecessors {
res = append(res, v)
}

?


func (m *MemoryWithDelete) createEntriesInMaps(ctx context.Context, node ocispec.Descriptor) {
key := descriptor.FromOCI(node)
if _, hasEntry := m.predecessors[key]; !hasEntry {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a simple ok should be good enough.

// There is no data consistency issue as long as deletion is not implemented
// for the underlying storage.
func (m *MemoryWithDelete) index(ctx context.Context, node ocispec.Descriptor, successors []ocispec.Descriptor) {
m.createEntriesInMaps(ctx, node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need createEntriesInMaps and removeEntriesFromMaps as they are used in only one place. Just expanding them may make the caller function more readable?

Copy link
Contributor Author

@wangxiaoxuan273 wangxiaoxuan273 Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to have these two functions for maintainability and modularity. The exact behaviors of predecessors, successors and other variables at indexing and deleting may change later as we continue to improve our design, and these two functions reduce the chance of errors.

I originally did not have these two functions, but as I keep working on this feature, I've found them being functions quite useful.

@@ -186,6 +186,25 @@ func loadIndex(ctx context.Context, index *ocispec.Index, fetcher content.Fetche
return nil
}

// loadIndex loads index into memory.
func loadIndexWithMemoryWithDelete(ctx context.Context, index *ocispec.Index, fetcher content.Fetcher, tagger content.Tagger, graph *graph.MemoryWithDelete) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be defined in deletableOci.go as it has nothing to do with readonlyoci.go.

Copy link
Contributor Author

@wangxiaoxuan273 wangxiaoxuan273 Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks. Moved it to deletableOci.go.

internal/graph/memoryWithDelete_test.go Outdated Show resolved Hide resolved
internal/resolver/memory.go Show resolved Hide resolved
internal/graph/memoryWithDelete_test.go Outdated Show resolved Hide resolved
@wangxiaoxuan273 wangxiaoxuan273 force-pushed the draft-oci-delete branch 2 times, most recently from fb078a0 to 91b86d9 Compare September 12, 2023 08:01
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

golang file names use underscores like deletable_memory.go

Copy link
Contributor Author

@wangxiaoxuan273 wangxiaoxuan273 Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. Updated the file names to be deletableoci.go and deletablememory.go, to be consistent with other file names in the repo.

@@ -33,3 +33,8 @@ func (s Set[T]) Contains(item T) bool {
_, ok := s[item]
return ok
}

// Delete deletes an item from the set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
// Delete deletes an item from the set
// Delete deletes item from the set.

See https://tip.golang.org/doc/comment#func.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the period.

}

// Index indexes predecessors for each direct successor of the given node.
// There is no data consistency issue as long as deletion is not implemented
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is no longer valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment.

m.lock.RLock()
defer m.lock.RUnlock()
key := descriptor.FromOCI(node)
set, exists := m.predecessors[key]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it be better to call it predecessors instead of set?

}

// loadIndexInDeletableMemory loads index into the memory.
func loadIndexInDeletableMemory(ctx context.Context, index *ocispec.Index, fetcher content.Fetcher, tagger content.Tagger, graph *graph.DeletableMemory) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to maintain DeletableStore separately, this function can be renamed to
func (ds *DeletableStore) loadIndex since it is only used by DeletableStore.

Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
Signed-off-by: Xiaoxuan Wang <[email protected]>
TerryHowe pushed a commit that referenced this pull request Oct 2, 2023
Part 2/4 of #454 

Based on draft PR #582

---------

Signed-off-by: Xiaoxuan Wang <[email protected]>
Co-authored-by: Terry Howe <[email protected]>
TerryHowe pushed a commit that referenced this pull request Oct 7, 2023
Signed-off-by: Xiaoxuan Wang <[email protected]>

Part 3/4 of #454

Based on draft PR #582

---------

Signed-off-by: Xiaoxuan Wang <[email protected]>
Wwwsylvia pushed a commit that referenced this pull request Oct 16, 2023
Signed-off-by: Xiaoxuan Wang <[email protected]>

Part 1/4 of #454

Based on draft PR #582 

Current behavior regarding `graph.Memory.Remove(node)`:
* `node` entry in `m.successors` and `m.nodes` is removed.
* `node` is removed from its successors predecessors list.
* `node` entry in `m.predecessors` is NOT removed, **unless all its
predecessors no longer exist**.
* `node` is NOT removed from its predecessors' `m.successors` list. The
`m.successors` is always in accordance with the actual content.

Signed-off-by: Xiaoxuan Wang <[email protected]>
@wangxiaoxuan273 wangxiaoxuan273 deleted the draft-oci-delete branch November 27, 2023 07:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

oci.Store and oci.Storage should implement content.Deleter
5 participants