Commit mechanism on local filesystem storage is not safe #804

Open
4 tasks done
rabernat opened this issue Mar 4, 2025 · 0 comments
rabernat commented Mar 4, 2025

What happened?

The script below demonstrates two processes clobbering each other's commits on local filesystem storage. Both workers commit without a conflict being raised, and only one worker's update survives.

What did you expect to happen?

With object storage, the conflicting commit triggers the retry loop and things work as expected. I expected local filesystem storage to behave the same way.
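The expected behavior can be sketched with a toy in-memory branch pointer that only accepts a commit when the caller's view of the tip is still current. This is a minimal illustration of optimistic concurrency with a retry loop, not Icechunk's implementation: `BranchRef`, its `read`/`commit` methods, and the local `ConflictError` class are hypothetical stand-ins, and a barrier forces both workers to start from the same version.

```python
import threading


class ConflictError(Exception):
    """Hypothetical stand-in for ic.ConflictError."""


class BranchRef:
    """Toy in-memory branch pointer with compare-and-swap semantics."""

    def __init__(self):
        self._lock = threading.Lock()
        self.version = 0
        self.done = []

    def read(self):
        with self._lock:
            return self.version, list(self.done)

    def commit(self, expected_version, new_done):
        # The commit succeeds only if nobody else committed since we read.
        with self._lock:
            if self.version != expected_version:
                raise ConflictError(
                    f"expected v{expected_version}, found v{self.version}"
                )
            self.version += 1
            self.done = new_done


def worker(ref, i, barrier):
    first_attempt = True
    while True:
        version, done = ref.read()
        if first_attempt:
            # Make sure both workers have read the same tip before committing.
            barrier.wait()
            first_attempt = False
        try:
            ref.commit(version, done + [i])
            return
        except ConflictError:
            continue  # re-read the new tip and try again


def run():
    ref = BranchRef()
    barrier = threading.Barrier(2)
    threads = [
        threading.Thread(target=worker, args=(ref, i, barrier)) for i in (1, 2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ref.version, sorted(ref.done)


if __name__ == "__main__":
    print(run())  # (2, [1, 2])
```

Whichever worker loses the race gets a conflict, re-reads, and commits on top of the winner, so both entries end up in the final state.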

Minimal Complete Verifiable Example

import multiprocessing as mp
import icechunk as ic
import zarr


def get_storage():

    # note: commits are actually not safe with local storage due to limitations of object_store
    storage = ic.local_filesystem_storage("data.icechunk")
    #storage = ic.s3_storage(bucket="icechunk-test", prefix="zarr_issue_2868", from_env=True)
    return storage


def worker(i):
    print(f"Stated worker {i}")
    storage = get_storage()
    repo = ic.Repository.open(storage)
    # keep trying until it succeeds
    while True:
        try:
            session = repo.writable_session("main")
            z = zarr.open(session.store, mode="r+")
            print(f"Opened store for {i} | {dict(z.attrs)}")
            a = z.attrs.get("done", [])
            a.append(i)
            z.attrs["done"] = a
            session.commit(f"wrote from worker {i}")
            break
        except ic.ConflictError:
            # another writer committed first; open a fresh session and retry
            print(f"Conflict for {i}, retrying")


def main():

    storage = get_storage()
    repo = ic.Repository.create(storage)
    session = repo.writable_session("main")

    zarr.create(
        shape=(10, 10),
        chunks=(5, 5),
        store=session.store,
        overwrite=True,
    )
    session.commit("initialized dataset")

    p1 = mp.Process(target=worker, args=(1,))
    p2 = mp.Process(target=worker, args=(2,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    session = repo.readonly_session(branch="main")
    z = zarr.open(session.store, mode="r")
    print(z.attrs["done"])
    print(list(repo.ancestry(branch="main")))


if __name__ == "__main__":
    main()

output

Stated worker 1
Stated worker 2
Opened store for 1 | {}
Opened store for 2 | {}
[2]
[SnapshotInfo(id="0V5P33GJ73GDTAKTHDJ0", parent_id=XWCPJQ68NXQARYF31SYG, written_at=datetime.datetime(2025,3,4,22,18,22,59377, tzinfo=datetime.timezone.utc), message="wrote from..."),
 SnapshotInfo(id="XWCPJQ68NXQARYF31SYG", parent_id=DW4F100ZG4HPGVP8PRKG, written_at=datetime.datetime(2025,3,4,22,18,21,896803, tzinfo=datetime.timezone.utc), message="initialize..."),
 SnapshotInfo(id="DW4F100ZG4HPGVP8PRKG", parent_id=None, written_at=datetime.datetime(2025,3,4,22,18,21,894492, tzinfo=datetime.timezone.utc), message="Repository...")]
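One interleaving that would be consistent with this output is a classic lost update: both workers read the same tip, and the second write to the branch pointer replaces the first without any check. The sketch below simulates that sequence deterministically. It is an assumption about the failure mode, not a confirmed diagnosis, and `UnconditionalRef` and the snapshot ids `"init"`, `"w1"`, `"w2"` are hypothetical names used only for illustration.

```python
class UnconditionalRef:
    """Toy branch pointer whose write never checks the current tip (hypothetical)."""

    def __init__(self):
        self.parent_of = {"init": None}  # snapshot id -> parent snapshot id
        self.attrs = {"init": []}        # snapshot id -> value of attrs["done"]
        self.tip = "init"

    def read(self):
        return self.tip, list(self.attrs[self.tip])

    def commit(self, snapshot_id, parent, done):
        self.parent_of[snapshot_id] = parent
        self.attrs[snapshot_id] = done
        self.tip = snapshot_id  # overwrites the tip, whatever it currently is

    def ancestry(self):
        node, out = self.tip, []
        while node is not None:
            out.append(node)
            node = self.parent_of[node]
        return out


def simulate():
    ref = UnconditionalRef()
    # Both workers open their sessions against the same tip.
    parent1, done1 = ref.read()
    parent2, done2 = ref.read()
    # Worker 1 commits, then worker 2 commits on the stale parent.
    ref.commit("w1", parent1, done1 + [1])
    ref.commit("w2", parent2, done2 + [2])
    return ref.attrs[ref.tip], ref.ancestry()


if __name__ == "__main__":
    print(simulate())  # ([2], ['w2', 'init'])
```

As in the real output, the final `done` list holds only worker 2's entry, and worker 1's snapshot is absent from the branch ancestry even though its commit call returned without error.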

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example runs when copied & pasted into a fresh Python environment.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

Anything else we need to know?

No response

Environment

platform: macOS-14.4.1-arm64-arm-64bit
python: 3.12.6
icechunk: 0.2.5
zarr: 3.0.4
numcodecs: 0.15.0

@rabernat changed the title from "Commit mechanism is not safe on local filesystem storage is not safe" to "Commit mechanism on local filesystem storage is not safe" on Mar 5, 2025