Multiple works within one object #329

Merged
merged 3 commits into from
Mar 27, 2019

kieranjol
Owner

This is the dreaded patch for when there are multiple works represented in a single object, e.g. four episodes on one tape, where there is one descriptive metadata record per episode.
So far, this PR allows accession.py to accept multiple reference numbers on the command line in this form: -reference af12345+ac432+ab567 etc.

This will then result in the following package (I used different args here but you get the picture).
Note the multiple filmographic and pbcore CSVs, one pair per reference number, named REF_filmographic.csv and accessionNumber_REF_pbcore.csv:

tree /Users/zlad/Desktop/staging/aaa1645 
/Users/zlad/Desktop/staging/aaa1645
├── a5efe037-004c-4494-a826-8c23c6906132
│   ├── logs
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132.mkv_manifest.md5
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_framemd5.log
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_normalise.log
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_sip_log.log
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_source_dfxml.xml_manifest.md5
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_source_mediainfo.xml_manifest.md5
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_source_mediatrace.xml_manifest.md5
│   │   └── metadata_manifest.md5
│   ├── metadata
│   │   ├── AF1235_filmographic.csv
│   │   ├── AF4567_filmographic.csv
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132.framemd5
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132.mkv_mediainfo.xml
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132.mkv_mediatrace.xml
│   │   ├── a5efe037-004c-4494-a826-8c23c6906132_dfxml.xml
│   │   ├── aaa1645_AF1235_pbcore.csv
│   │   ├── aaa1645_AF4567_pbcore.csv
│   │   ├── bars_v210_pcm24le_stereo_20sec.mov_source.framemd5
│   │   └── supplemental
│   │       ├── a5efe037-004c-4494-a826-8c23c6906132_source_dfxml.xml
│   │       ├── a5efe037-004c-4494-a826-8c23c6906132_source_mediainfo.xml
│   │       └── a5efe037-004c-4494-a826-8c23c6906132_source_mediatrace.xml
│   └── objects
│       └── a5efe037-004c-4494-a826-8c23c6906132.mkv
├── a5efe037-004c-4494-a826-8c23c6906132_manifest-sha512.txt
└── a5efe037-004c-4494-a826-8c23c6906132_manifest.md5
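The naming pattern visible in the tree above can be sketched roughly as follows. This is only an illustration based on the filenames shown, not the code in accession.py; the function name `metadata_csv_names` and its arguments are made up for the example.

```python
def metadata_csv_names(accession_number, references):
    """Return one (filmographic, pbcore) filename pair per reference number.

    Hypothetical helper: the pattern is inferred from the tree in this PR,
    it is not taken from accession.py itself.
    """
    names = []
    for reference in references:
        # One filmographic CSV per work, named after the reference number.
        filmographic = f'{reference}_filmographic.csv'
        # One pbcore CSV per work, prefixed with the accession number.
        pbcore = f'{accession_number}_{reference}_pbcore.csv'
        names.append((filmographic, pbcore))
    return names


for pair in metadata_csv_names('aaa1645', ['AF1235', 'AF4567']):
    print(pair)
```

With the accession number and reference numbers from the tree, this gives the same four CSV names that appear in the metadata folder.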

@kieranjol
Owner Author

review by @raecasey and @ecodonohoe greatly appreciated..
BTW this will also result in us needing to have multiple titles in the accession register:
Muintir na Meara Season 4 Episode 1| Muintir na Meara Season 4 Episode 2 etc
We might be tempted to mirror the tape accessions register and say Muintir na Meara Season 4 Episodes 1-4 but this might not be the most consistent, as the Loopline project has many examples where there are rushes from one project mixed with another, eg:
Patrick Scott Rushes 42|Essies Last Stand Rushes 4

THE JOYS OF AV ARCHIVING

@kieranjol
Owner Author

Also you might be wondering why + is the delimiter... well, the | is already meaningful for us, as it separates things like representation of X | reproduction of Y. I also tried & and , as delimiters, but these are meaningful to bash and result in pretty broken scripts. So + seemed like something that worked on an intuitive and systems level.
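For anyone curious, splitting a +-delimited value can be sketched like this with argparse. It is a minimal illustration, not the actual accession.py code: only the -reference flag and the + delimiter come from this PR, while `split_references` and `build_parser` are hypothetical names.

```python
import argparse


def split_references(value):
    """Split 'af12345+ac432+ab567' into a list of reference numbers."""
    references = [ref.strip() for ref in value.split('+') if ref.strip()]
    if not references:
        # argparse reports this as a usage error for the -reference option.
        raise argparse.ArgumentTypeError('expected at least one reference number')
    return references


def build_parser():
    parser = argparse.ArgumentParser(description='Sketch of multi-reference parsing')
    parser.add_argument(
        '-reference',
        type=split_references,
        default=[],
        help='one or more reference numbers joined with +',
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(['-reference', 'af12345+ac432+ab567'])
print(args.reference)
```

Because + has no special meaning to bash in this position, the value does not need quoting on the command line, which is the whole point of picking it.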

@kieranjol changed the title from "Multiple works" to "Multiple works within one object" on Mar 8, 2019
@mcampos-quinn
Contributor

GODSPEED

@kieranjol
Owner Author

Merciiiiiii :(

@kieranjol
Owner Author

review by @raecasey and @ecodonohoe greatly appreciated..
BTW this will also result in us needing to have multiple titles in the accession register:
Muintir na Meara Season 4 Episode 1| Muintir na Meara Season 4 Episode 2 etc
We might be tempted to mirror the tape accessions register and say Muintir na Meara Season 4 Episodes 1-4 but this might not be the most consistent, as the Loopline project has many examples where there are rushes from one project mixed with another, eg:
Patrick Scott Rushes 42|Essies Last Stand Rushes 4

THE JOYS OF AV ARCHIVING

So I think that I was wrong here. There will be examples of 10+ Works on a tape, and these will often be episodes of a series. I think that we should just say general things like Episodes 1-4, Episodes 4-12 just to keep the registers from being really messy.
There will be instances though where we have to be specific about the multiple titles, like the aforementioned Patrick Scott Rushes 42|Essies Last Stand Rushes 4
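The two title styles discussed here could be sketched as below. This is purely illustrative: the function names are hypothetical, the episode range assumes the episodes on the tape are contiguous, and the bare | with no surrounding spaces is just one of the spacings used in this thread.

```python
def series_title(series, episodes):
    """Summarise a run of episodes, e.g. 'Muintir na Meara Season 4 Episodes 1-4'.

    Assumes the episode numbers form one contiguous run.
    """
    ordered = sorted(episodes)
    if len(ordered) == 1:
        return f'{series} Episode {ordered[0]}'
    return f'{series} Episodes {ordered[0]}-{ordered[-1]}'


def piped_title(titles):
    """List unrelated works explicitly, separated by pipes."""
    return '|'.join(title.strip() for title in titles)


print(series_title('Muintir na Meara Season 4', [1, 2, 3, 4]))
print(piped_title(['Patrick Scott Rushes 42', 'Essies Last Stand Rushes 4']))
```

The first form keeps the register tidy for series, and the second covers the mixed-rushes outliers.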

@raecasey
Contributor

wow, - mighty work.
it's a total headmelt.
the patch and what you've done looks great.
Are there going to be other examples where more than one filmographic is used but the object does not contain multiple works? might be a silly question, - but worth checking anything that might clash with this in the future.

also, I can't see any other way around the accession register other than what you have described above. it's the object we are accessioning. we can't have multiple rows/lines for one accession number and we should fully list the content, - but it's just the simple name that will have multiple entries, - format and reproduction will just be listed once.
so i'd be in favour of what you have described

@kieranjol
Owner Author

kieranjol commented Mar 13, 2019

wow, - mighty work.

merci :)

it's a total headmelt.
the patch and what you've done looks great.
Are there going to be other examples where more than one filmographic is used but the object does not contain multiple works? might be a silly question, - but worth checking anything that might clash with this in the future.

What would this look like? Is it something like separate audio and image reels that are for the same work? Would these film reels have the same filmographic but different technical records? An example would help jog my brain.

also, I can't see any other way around the accession register other than what you have described above. it's the object we are accessioning. we can't have multiple rows/lines for one accession number and we should fully list the content, - but it's just the simple name that will have multiple entries, - format and reproduction will just be listed once.
so i'd be in favour of what you have described

Cool, so things like Muintir na Meara Episodes 1-4 but allowing for piping for outliers like Patrick Scott (Rushes 4) | James Gandon (Rushes 58)?

Also if the trees look good to you, I might go ahead and accession the tests I did. I can send you the location of them if you wanna take a look.

Thx for the review.

@raecasey
Contributor

raecasey commented Mar 13, 2019

yep -

wow, - mighty work.

merci :)

it's a total headmelt.
the patch and what you've done looks great.
Are there going to be other examples where more than one filmographic is used but the object does not contain multiple works? might be a silly question, - but worth checking anything that might clash with this in the future.

What would this look like? Is it something like separate audio and image reels that are for the same work?
Would these film reels have the same filmographic but different technical records? An example would help me jog my brain.

i think that example would be ok, as it's essentially something akin to a restoration, - so one filmographic and multiple tech recs. that's easily enough solvable i think, - with some work arounds. who knows what radharc might throw up though.
what i was thinking was more like if something belonged to a compilation. so it has its own filmographic, but is also a partial representation of another filmographic.
and possibly other examples we haven't thought of yet. most likely existing in the radharc collection.

also, I can't see any other way around the accession register other than what you have described above. it's the object we are accessioning. we can't have multiple rows/lines for one accession number and we should fully list the content, - but it's just the simple name that will have multiple entries, - format and reproduction will just be listed once.
so i'd be in favour of what you have described

Cool, so things like Muintir na Meara Episodes 1-4 but allowing for piping for outliers like Patrick Scott (Rushes 4) | James Gandon (Rushes 58)?
yes, - exactly. we'll have to be careful the first few times. maybe a bunch of us look at it once it is done, to get feedback.

Also if the trees look good to you, I might go ahead and accession the tests I did. I can send you the location of them if you wanna take a look.

cool, - i got the location, thanks, - i think you can go ahead.

Thx for the review.

@raecasey
Contributor

oops, - sorry for not indenting

@kieranjol
Owner Author

yep -

wow, - mighty work.

merci :)

it's a total headmelt.
the patch and what you've done looks great.
Are there going to be other examples where more than one filmographic is used but the object does not contain multiple works? might be a silly question, - but worth checking anything that might clash with this in the future.

What would this look like? Is it something like separate audio and image reels that are for the same work?
Would these film reels have the same filmographic but different technical records? An example would help me jog my brain.

i think that example would be ok, as it's essentially something akin to a restoration, - so one filmographic and multiple tech recs. that's easily enough solvable i think, - with some work arounds. who knows what radharc might throw up though.

I think this is similar to what @gavinrichardmartin encountered yesterday, and it may not need any extra code. It could be solved by just saying 'Reproduction of IFA-2001-12, IFA-2014-8876, MV1234' in the CSV, and that's how it would appear in the Tape Origin field in the PBCore/Tech record. Ideally we'd delimit with pipes there, but doing so in the CSV would possibly break things at the moment. The 'Reproduction of' values are not as essential as the 'Representation of' reference numbers, which are absolutely essential for pulling in the correct filmographic records, for harvesting film titles for registers, and more. Actually, I just remembered that bash might have problems with commas, so we might need to use + here again as well to separate the values.

what i was thinking was more like if something belonged to a compilation. so it has its own filmographic, but is also a partial representation of another filmographic.

I dunno, I think in this instance, taking 4 provinces as an example, we could either not migrate this tape at all, as it's an abomination, or just use the 4 provinces reference number only and ignore that it's made up of other individual works that we also hold separately.

and possibly other examples we haven't thought of yet. most likely existing in the radharc collection.

Yes, I think amateur collections might have examples of this too. I wonder if this is the kind of thing where the issue is actually resolved outside of any of these workflows, and any issues with those records are fixed at source within the actual database records? I'm struggling to think of examples right now though.

also, I can't see any other way around the accession register other than what you have described above. it's the object we are accessioning. we can't have multiple rows/lines for one accession number and we should fully list the content, - but it's just the simple name that will have multiple entries, - format and reproduction will just be listed once.
so i'd be in favour of what you have described

Cool, so things like Muintir na Meara Episodes 1-4 but allowing for piping for outliers like Patrick Scott (Rushes 4) | James Gandon (Rushes 58)?
yes, - exactly. we'll have to be careful the first few times. maybe a bunch of us look at it once it is done, to get feedback.
Also if the trees look good to you, I might go ahead and accession the tests I did. I can send you the location of them if you wanna take a look.

cool, - i got the location, thanks, - i think you can go ahead.

Savage, I'll plough ahead so maybe tomorrow.

Thx for the review.

@raecasey
Contributor

so i started writing a big long reply yesterday and then confused myself. so i'll just show you.
it's essentially about adverts. multiple adverts on one filmographic skeletal record, with one tech record as they are on one reel. however, they were scanned individually, so may potentially get second individual filmographics.
it won't affect your proceeding with this multiple works progress in any case, - as i think we need to solve the problem at source.
and yes, - four provinces is an abomination.

@kieranjol
Owner Author

kieranjol commented Mar 14, 2019 via email

@kieranjol merged commit 822b2c5 into master on Mar 27, 2019