Inventory
< geoff.froh at densho.org > 2015-01-08
[T]he purpose of the Inventory is to provide the definitive, ground-truth registry for what Collection repos should exist (and which should not)...
The first part of the solution is the combination of partner repos and the "master" ddr repo that you've already deployed. This gives us a beachhead for establishing ground truth. Just as the models and controlled vocabs are now encapsulated in the ddr repo, we can place the Inventory or Collection registry in the partner repos. And, of course, a corresponding registry of partners in the overall ddr repo. This allows a future archivist to bootstrap up the complete picture of what should be found in the Repository.
While the Store documents perform useful functions (i.e., tracking the physical location of the annexes), they do not provide a registry of collections that should exist. There should be a single text document in each partner repo that contains a list of all of the collection repos that should exist. This ensures a simple single-point where collections are registered, and a clear, human-readable artifact that is not dependent on any specific technology (other than text!).
The Inventory is a distributed database consisting of the `ddr` repo and the Volume files contained within `ddr-PARTNER` repos.
The master inventory is created by parsing these files. In this way they are analogous to the `*.journal` files used by `ledger-cli`.
The Inventory tells us:
- Information about the Repository as a whole
  - List of partners
  - List of collections
  - List of volumes in use
- For each volume,
  - type, location, list of partners, collections
- For each collection,
  - location of instances and their type
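As a rough illustration of how parsed volume records could be rolled up into the summary listed above, here is a minimal Python sketch. The function name `build_inventory` and the shape of the returned dictionary are hypothetical; the input keys (`org`, `uuid`, `type`, `location`, `collections`, `cid`, `level`) follow the `{uuid}.json` examples in this document. Treating `level` as the "type" of an instance is an assumption.

```python
from collections import defaultdict


def build_inventory(volumes):
    """Summarize parsed volume records into one in-memory inventory.

    `volumes` is an iterable of dicts shaped like the {uuid}.json examples.
    The returned structure is illustrative only, not an established format.
    """
    inventory = {
        "partners": set(),
        "volumes": {},                     # volume UUID -> summary
        "collections": defaultdict(list),  # collection ID -> list of instances
    }
    for vol in volumes:
        inventory["partners"].add(vol["org"])
        # Several partners may each have a file for the same physical volume.
        summary = inventory["volumes"].setdefault(
            vol["uuid"],
            {
                "type": vol["type"],
                "location": vol["location"],
                "partners": set(),
                "collections": set(),
            },
        )
        summary["partners"].add(vol["org"])
        for coll in vol.get("collections", []):
            summary["collections"].add(coll["cid"])
            inventory["collections"][coll["cid"]].append(
                {
                    "volume": vol["uuid"],
                    "location": vol["location"],
                    "level": coll["level"],  # assumed to be the instance "type"
                }
            )
    return inventory
```

Keying volumes by UUID means records for the same device from different partners merge into one entry, which matches the idea that the master inventory is derived from many small files.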
The `ddr` repo:
- Present on every `ddr` volume.
- Metadata about the big-R Repository (keyword, title, description, logo, etc.).
- Model fields.
- Controlled vocab (topics, etc.).
- Pointers to `ddr-PARTNER` repos.
ddr-PARTNER/
├── organization.json
└── volumes
    ├── 297e76ea-bb22-40e9-9999-0c9be5332a39.json
    ├── 297e76ea-bb22-40e9-9999-0c9be5332a39.log
    ├── 408A51BE8A51B160.json
    └── 408A51BE8A51B160.log
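Given the layout above, reading a partner's volume files could look roughly like the following Python sketch. The helper name `load_volume_files` is hypothetical, and the code assumes only what the tree shows: a `volumes` directory holding a `.json` file and a matching `.log` file per volume.

```python
import json
from pathlib import Path


def load_volume_files(partner_repo):
    """Yield (record, log_path) for each volumes/*.json in a ddr-PARTNER checkout.

    `record` is the parsed JSON; `log_path` is where the matching
    {uuid}.log is expected to be (it is not opened or checked here).
    """
    volumes_dir = Path(partner_repo) / "volumes"
    for json_path in sorted(volumes_dir.glob("*.json")):
        with json_path.open(encoding="utf-8") as f:
            record = json.load(f)
        yield record, json_path.with_suffix(".log")
```

Note that `json.load` is strict: a volume file with a missing or trailing comma will raise `json.JSONDecodeError`, so hand-edited files need to be valid JSON.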
The `ddr-PARTNER` repo:
- Present on every `ddr` volume used by a partner.
- Metadata about the partner (keyword, title, description, logo, etc.).
- Volumes in use by the partner.
Volume files:
- `*.json` and `*.log` files within `ddr-PARTNER` repos.
- Filename: `{uuid}.json`, `{uuid}.log`.
The `*.json` file contains
- device info (device type, filesystem, label, UUID, size, purchase/create date, etc.)
- list of the partner's collections on the device
- information about each collection clone/instance
The `*.log` file contains
- modifications to the Volume (collection created, cloned, removed; modified?)
- timestamp and user for each modification
Note: A volume file may not list all the collections on the volume! If a volume contains collections from multiple partners, it will be necessary to read multiple volume files from different partners to see what is on a specific volume.
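To make the point concrete, the following Python sketch gathers everything known about one physical volume from volume records belonging to several partners. The function name `collections_on_volume` is hypothetical; the keys `uuid`, `org`, `collections`, and `cid` follow the `{uuid}.json` examples in this document.

```python
def collections_on_volume(volume_uuid, records):
    """Return {partner: [collection IDs]} for one physical volume.

    `records` should contain the parsed volume files from every partner
    repo; any single partner's file only lists that partner's collections.
    """
    found = {}
    for rec in records:
        if rec["uuid"] != volume_uuid:
            continue  # record describes a different device
        cids = [coll["cid"] for coll in rec.get("collections", [])]
        found.setdefault(rec["org"], []).extend(cids)
    return found
```

If the result has more than one key, the volume is shared, and no single partner's volume file gives the full picture.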
Note: I don't think there is a way to recreate the history of a particular instance of a collection on a particular volume. We'll just have to start each `*.log` file with a notice that collections were present on Volume X when the logfile was created.
Each collection record in a `{volume}.json` file contains
- collection ID
- git-annex UUID
- level (meta, access, all, ???)
- local annex keys, size
- known annex keys, size
### HDD volume `{UUID}.json`
```
{
    "repo": "ddr",
    "org": "testing",
    "type": "hdd",
    "fstype": "ext4",
    "uuid": "297e76ea-bb22-40e9-9999-0c9be5332a39",
    "size": "1072693248",
    "label": "ddrworkstation",
    "location": "Pasadena",
    "collections": [
        {
            "cid": "ddr-testing-141",
            "uuid": "dfb5f708-c901-11e4-b1e1-e3fff14a483d",
            "level": "all",
            "local annex keys": "12",
            "local annex size": "128 MB",
            "known annex keys": "12",
            "known annex size": "128 MB"
        }
    ]
}
```
### USB volume `{UUID}.json`
```
{
    "repo": "ddr",
    "org": "testing",
    "type": "usb",
    "fstype": "ntfs",
    "uuid": "408A51BE8A51B160",
    "size": "500096991232",
    "label": "WD5000BMV-2",
    "location": "Pasadena",
    "purchase_date": "2013-03-01",
    "collections": [
        {
            "cid": "ddr-testing-228",
            "uuid": "e93ab2f4-7a4d-11e3-b2cf-37a8fc974942",
            "level": "meta",
            "local annex keys": "0",
            "local annex size": "0 MB",
            "known annex keys": "12",
            "known annex size": "128 MB"
        }
    ]
}
```
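One use for the per-collection key counts is spotting instances that hold only part of a collection's binaries. The Python sketch below is one possible check, assuming the counts are stored as numeric strings as in the examples; the function name `incomplete_collections` is hypothetical.

```python
def incomplete_collections(volume):
    """Return IDs of collections whose local annex holds fewer keys than are known.

    `volume` is a parsed {uuid}.json record. Key counts are stored as
    strings in the examples, so they are converted to int before comparing.
    """
    missing = []
    for coll in volume.get("collections", []):
        local = int(coll["local annex keys"])
        known = int(coll["known annex keys"])
        if local < known:
            missing.append(coll["cid"])
    return missing


# Trimmed-down version of the USB volume record, for illustration.
usb_volume = {
    "uuid": "408A51BE8A51B160",
    "collections": [
        {"cid": "ddr-testing-228", "level": "meta",
         "local annex keys": "0", "known annex keys": "12"},
    ],
}
print(incomplete_collections(usb_volume))  # ['ddr-testing-228']
```

A `meta`-level instance is expected to show up here, since it carries metadata without the annex content; whether that counts as a problem depends on the intended level.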
### `SAMPLE-UUID.log`
    2015-05-19T14:07-0800 gjost Created logfile.
    2015-05-19T14:07-0800 gjost exists ddr-test-123
    2015-05-19T14:07-0800 gjost exists ddr-test-246
    2015-05-19T14:07-0800 gjost cloned ddr-test-247
    2015-05-19T14:07-0800 gjost created ddr-test-248
    2015-05-19T14:07-0800 gjost deleted ddr-test-248
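A log in this shape is straightforward to parse. The Python sketch below assumes one entry per line in the form `timestamp user action collection-id`, with the action words taken from the sample (`exists`, `cloned`, `created`, `deleted`); anything else, such as the opening "Created logfile." line, is kept as a free-text note. The names `LogEntry` and `parse_log_line` are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple, Optional

# Action words seen in the sample log; the real set may differ.
ACTIONS = {"exists", "cloned", "created", "deleted"}


class LogEntry(NamedTuple):
    timestamp: datetime
    user: str
    action: str                # one of ACTIONS, or "note" for free text
    collection: Optional[str]  # collection ID, or None for notes
    text: str                  # everything after the user name


def parse_log_line(line):
    """Parse one '<timestamp> <user> <rest>' log line into a LogEntry."""
    timestamp, user, rest = line.strip().split(" ", 2)
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M%z")
    action, _, target = rest.partition(" ")
    # The match is case-sensitive, so "Created logfile." is not mistaken
    # for the creation of a collection called "logfile.".
    if action in ACTIONS and target:
        return LogEntry(when, user, action, target, rest)
    return LogEntry(when, user, "note", None, rest)


entry = parse_log_line("2015-05-19T14:07-0800 gjost cloned ddr-test-247")
print(entry.action, entry.collection)  # cloned ddr-test-247
```

Keeping the raw `text` alongside the parsed fields means unrecognized lines are preserved rather than dropped when the log is replayed.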