Early Discussion: New save format to reduce save size and number of files #5771

Open
randombk opened this issue Nov 28, 2024 · 5 comments
Labels
enhancement src changes related to source code.

Comments

@randombk
Contributor

randombk commented Nov 28, 2024

Is your feature request related to a problem? Please describe.

Context and Motivation

CBN currently stores saves as loose uncompressed JSON files. Each map chunk is a separate file. Each player-chunk tuple is another file.

With very little exploration, CBN saves balloon to over a GB spread across hundreds of thousands of individual files. This slows down saving and loading, and makes filesystem operations on saves (e.g. backups, syncing across machines) particularly slow.

Describe the solution you'd like

Proposal

Migrate to a new save format where map tiles are stored as compressed JSON blobs in a SQLite3 database.

  • This would consolidate the multiple loose files, reduce storage sizes, all while minimizing the incremental complexity and dependencies introduced.
  • Very minimal changes to the codebase are needed - no JSON schemas are changed, only the filesystem read/write layer.
  • Migration to SQLite (and ACID transactions) would also help reduce save corruption.
  • This format would also be easy to manually manipulate if/when necessary.
  • Maintain backwards compatibility and incremental migration for existing saves.
  • Improved efficiency of save format would also potentially open the door to saving more detailed information about the world in the future.

The planned save format will contain one sqlite db for the game world, plus one for each character. Specifically:

map.db replaces:

  • maps/ directory (map tile info)
  • o.* files (overmap info)

<player_id>.db replaces:

  • <player_id>.seen.* files (player overmap visibility data)
  • <player_id>.mm1/ directory (player map tile memory)

All other files will remain unchanged. This should cover all the worst offenders while leaving the more manual-edit-friendly files untouched.

Each database will consist of a single table:

CREATE TABLE IF NOT EXISTS files (
    path           TEXT PRIMARY KEY NOT NULL,
    parent         TEXT NOT NULL,  -- Currently unused, but adding here for future compat considerations
    compression    TEXT DEFAULT NULL,  -- NULL for uncompressed. 'zlib' if `data` is compressed.
    data           BLOB NOT NULL
);
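As a rough illustration of how this table could be used, here is a minimal Python sketch that stores a JSON value as a zlib-compressed blob and reads it back. The helper names (`open_db`, `write_file`, `read_file`) and the example path are hypothetical and not part of the proposal; the actual implementation would live in the game's C++ read/write layer.

```python
import json
import sqlite3
import zlib

# Schema copied from the proposal above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path           TEXT PRIMARY KEY NOT NULL,
    parent         TEXT NOT NULL,
    compression    TEXT DEFAULT NULL,
    data           BLOB NOT NULL
);
"""


def open_db(filename):
    """Open (or create) a save database and make sure the table exists."""
    conn = sqlite3.connect(filename)
    conn.execute(SCHEMA)
    return conn


def write_file(conn, path, obj, compress=True):
    """Serialize obj to JSON and store it under path (hypothetical helper)."""
    raw = json.dumps(obj).encode("utf-8")
    data = zlib.compress(raw) if compress else raw
    # Assumption: `parent` holds the directory part of the path.
    parent = path.rsplit("/", 1)[0] if "/" in path else ""
    conn.execute(
        "INSERT OR REPLACE INTO files (path, parent, compression, data) "
        "VALUES (?, ?, ?, ?)",
        (path, parent, "zlib" if compress else None, data),
    )


def read_file(conn, path):
    """Return the decoded JSON stored under path, or None if it is absent."""
    row = conn.execute(
        "SELECT compression, data FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return None
    compression, data = row
    if compression == "zlib":
        data = zlib.decompress(data)
    elif compression is not None:
        raise ValueError("unknown compression: " + compression)
    return json.loads(data.decode("utf-8"))
```

Because the JSON payload is stored unchanged apart from compression, the existing JSON schemas would not need to change under this sketch.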

Proposed Migration Path

Phase 1. [#5774] Backend cleanup of the world save/load codebase to make it easier to support multiple save formats. Keep the format unchanged at this stage.
Phase 2. Implement the new format under an experimental feature flag/option. There should be a migration path for old saves.
Phase 3. Roll the format out as the default format. Keep backwards compatibility for a long/indefinite period of time.
Phase 4. (Optional) Deprecate the old format, and keep around a script to upgrade from the old format.
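An upgrade script of the kind mentioned in Phase 4 could look roughly like the following Python sketch. It walks a directory of loose files and copies each one into the `files` table as a zlib-compressed blob. The function name `migrate_directory` and the choice to store paths relative to the source directory are assumptions made for illustration, not a settled design.

```python
import os
import sqlite3
import zlib

SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path           TEXT PRIMARY KEY NOT NULL,
    parent         TEXT NOT NULL,
    compression    TEXT DEFAULT NULL,
    data           BLOB NOT NULL
);
"""


def migrate_directory(src_dir, db_path):
    """Copy every file under src_dir into db_path; return the file count."""
    conn = sqlite3.connect(db_path)
    try:
        count = 0
        # `with conn` commits on success and rolls back on an exception,
        # so a failed migration does not leave a half-filled database.
        with conn:
            conn.execute(SCHEMA)
            for root, _dirs, names in os.walk(src_dir):
                for name in names:
                    full = os.path.join(root, name)
                    # Assumption: keys are '/'-separated paths relative to src_dir.
                    rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
                    parent = rel.rsplit("/", 1)[0] if "/" in rel else ""
                    with open(full, "rb") as fh:
                        raw = fh.read()
                    conn.execute(
                        "INSERT OR REPLACE INTO files "
                        "(path, parent, compression, data) VALUES (?, ?, ?, ?)",
                        (rel, parent, "zlib", zlib.compress(raw)),
                    )
                    count += 1
        return count
    finally:
        conn.close()
```

The sketch leaves the original files in place, so an old-format save would stay usable until the migrated copy has been verified.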

Additional context

No response

@Coolthulhu
Member

The reason for multiple save files is that we fairly often modify the files separately.
One benefit of the current implementation is that the results are human-readable. Would the SQLite variant allow easy access to the data itself?

@thedyze
Contributor

thedyze commented Nov 28, 2024

How about simple zip files? Would allow any user to have easy access.

@Zireael07
Contributor

@thedyze Performance penalty for packing/unpacking and a very tiny benefit (slightly better organization).

@randombk
Contributor Author

randombk commented Nov 28, 2024

@Coolthulhu The current thinking is that files like master.gsav, worldoptions.json, etc., which are fixed in number and more easily hand-edited, would stay unchanged.

The emphasis is really on the map tile & visibility data (files in maps/, o.0.0 and co, *.seen.*, etc).

  • These are best edited via the debug editor in any case. People probably shouldn't be hand-editing massive JSON arrays.
  • The format itself should be easy to pack/unpack on the command line via the sqlite3 tool, and we'll probably provide a few scripts to make it easier, and to convert between v1 and v2 save formats.
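An unpack script along these lines might look like the Python sketch below, which writes every row of the `files` table back out as a loose file. The function name `unpack` and the assumption that `path` values are '/'-separated relative paths are illustrative only.

```python
import os
import sqlite3
import zlib


def unpack(db_path, out_dir):
    """Write each row of the `files` table to out_dir; return the paths written."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT path, compression, data FROM files ORDER BY path"
        ).fetchall()
    finally:
        conn.close()

    written = []
    for path, compression, data in rows:
        if compression == "zlib":
            data = zlib.decompress(data)
        elif compression is not None:
            raise ValueError("unknown compression: " + compression)
        # Assumption: stored paths are relative and use '/' as the separator.
        dest = os.path.join(out_dir, *path.split("/"))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with open(dest, "wb") as fh:
            fh.write(data)
        written.append(path)
    return written
```

A production script would also want to reject stored paths containing `..` before writing anything to disk.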

@thedyze I decided against zip as the API is a little weirder and we'd lose the ability to attach extra metadata if needed. SQL just provides a bunch more flexibility.

SQLite also gives some nice bonuses around ACID transactions so we'll finally be able to fix the crashed-during-save corruption.
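To make the transaction point concrete, here is a small Python sketch in which a whole save is written inside one transaction, so a failure partway through leaves the previous save untouched. The function name `save_all` is hypothetical, and a Python exception stands in for a crash; in the real game the same guarantee would come from SQLite's own atomic commit.

```python
import sqlite3
import zlib


def save_all(conn, entries):
    """Write every (path, raw_bytes) pair in one transaction: all or nothing."""
    # Used as a context manager, the connection commits if the block
    # finishes and rolls back if an exception escapes it.
    with conn:
        for path, raw in entries:
            conn.execute(
                "INSERT OR REPLACE INTO files "
                "(path, parent, compression, data) VALUES (?, ?, ?, ?)",
                (path, "", "zlib", zlib.compress(raw)),
            )


def load_raw(conn, path):
    """Return the decompressed bytes stored under path."""
    row = conn.execute("SELECT data FROM files WHERE path = ?", (path,)).fetchone()
    return zlib.decompress(row[0])
```

If `entries` raises midway through iteration, the rows already written in that call are rolled back and readers still see the last complete save.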

@RoyalFox2140
Copy link
Collaborator

I am very late, but I hope anything we do is going to allow migration of save files to the new format. This doesn't need to be in the main release; if it's not difficult to supply a migration for saves, it could be a separate tool. We can still gut out all the old data and make users upgrade to the new format; we would just ideally supply a tool for it.

Users right now can pretty much load a save from early DDA and use each stable version of DDA and then BN to upgrade one version at a time, which is a side effect of DDA/BN's intent on maintaining migration and the lack of major overhauls to the save data for years. I am not worried about disrupting saves older than a few months, however.
