Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert mb_artistid and mb_albumartistid to lists #46

Merged
merged 1 commit into from
Jun 27, 2021

Conversation

JuniorJPDJ
Copy link
Contributor

@JuniorJPDJ JuniorJPDJ commented May 13, 2021

Resolves #43.

@JuniorJPDJ JuniorJPDJ changed the title Convert mb_artistid and mb_albumartistid to lists lists Convert mb_artistid and mb_albumartistid to lists May 18, 2021
@JuniorJPDJ JuniorJPDJ marked this pull request as ready for review May 18, 2021 00:22
@JuniorJPDJ
Copy link
Contributor Author

Rebased on top of master.

@sampsyo
Copy link
Member

sampsyo commented May 18, 2021

Alright, so this is the tricky one. On the surface, it's a simple change; unfortunately, the interaction with reality is complicated.

Probably most importantly (and as discussed elsewhere), changing the type of these existing fields would break current software that depends on these fields—most importantly, it would break beets (and beets plugins). This change would be tricky to get right everywhere, especially considering third-party plugins that might rely on the string behavior. Providing a second interface to get list-style access to the same data would be awesome.

Second, there is a broader issue here about how to deal with multi-value tags in MediaFile. It would be really great to come up with some sort of consistent way to expose list-formatted data. However, there are many complicating factors:

  • Most fields can be made into lists for most formats, but not all. So a blanket policy of making things lists would end up with weird exceptions.
  • Some fields don't feel like they would meaningfully want to be lists, but exactly which is a frustratingly human judgment call. (Should multiple titles be supported? Multiple release dates?)
  • This sounds silly, but non-list values are just plain easier to deal with than list values. The common case for genre, for instance, is that it works just fine to expose a single string—and MediaFile makes it a lot easier to deal with when all you want is that common case. More advanced use cases definitely would like to store multiple genres. But a good design would "make the easy things easy and the hard things possible."

So generally speaking, there is a deeper problem to be solved here about how to expose both list-shaped and non-listy versions of data—for different use cases and different situations. I honestly don't know what to do about it. But I confess some discomfort with introducing lists of data without more clarity on what the grand plan is.

@JuniorJPDJ
Copy link
Contributor Author

JuniorJPDJ commented May 18, 2021

Actual problem with those values is that files are already tagged with this fields as multi-valued by most of software and I'm not sure why those are NOT lists at the first place. Musicbrainz itself specifies those tags as multi-valued as you need to write information about all artist IDs to the same tag. mediafile seems to expose only one of all values in lists (first one AFAIR) in case of most of formats and some concatenated strings in case of ID3v2.3 (as shown in #43) - this behaviour is bad and inconsistent.

Most fields can be made into lists for most formats, but not all. So a blanket policy of making things lists would end up with weird exceptions.

Is there any format beyond ID3v<=2.3 that doesn't support those? This was the only format I had problems with when testing this tbh and those two fields have already applied solution for problem with ID3v<=2.3.

Some fields don't feel like they would meaningfully want to be lists, but exactly which is a frustratingly human judgment call. (Should multiple titles be supported? Multiple release dates?)

I don't get what you are talking about now - fields you mentioned are not lists and are not supposed to be lists. Only few of fields are made to be lists and part of those were already implemented before I tried to deal with multi-valued fields ;O

In sake of backwards compatibility I probably could rename those fields to mb_artistids and mb_albumartistids and provide backward-compatible versions of those in similar fashion like genre and genres are created, but IMO it still would be bad (better than present implementation tho).

I'm not sure what will happen if you for example overwrite first value using single-value tag when list would contain more than one value. Will it remove other values? Leave other values untouched and just change first? Second option creates loads of bugs in end software.

But I confess some discomfort with introducing lists of data without more clarity on what the grand plan is.

There's no grand plan ;D
I just wanted to fix bug I found as I gonna use mediafile in my software (and probably try to convince Funkwhale devs to use it too).

And yes, this is a bug - mediafile was interpreting files tagged by other software (in case of my tests - Picard) in inconsistent and not intended way, probably also destroying already tagged files (I haven't checked it tho).

EDIT:
Just checked if it is destroying tagged files and it's not!
It's not that bad!
It seems not to touch tags which are not changed by user.
I'll probably need to check the same in case of first value from list tag version when implemented.

>>> mf = MediaFile('/home/juniorjpdj/temp/Szpadyzor crew.flac')
>>> mf.mb_artistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.save()
>>> mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ARTISTID']
['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
>>> mf.mb_artistid = "test"
>>> mf.mb_artistid
'test'
>>> mf.save()
>>> mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ARTISTID']
['test']
>>> mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ALBUMARTISTID']
['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']

@JuniorJPDJ
Copy link
Contributor Author

JuniorJPDJ commented May 21, 2021

I just renamed lists to plural forms and added .single_field() fields on singular fields.
It looks like even compatibility with behaviour shown above is kept!

>>> mf.mb_albumartistids
['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid = '164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistids
['164ed128-525d-4798-883f-be13502de86b']

I also rolled back tests as my changes are not needed anymore as main tags work as before.

What do you think about it now?

EDIT: it also works for ID3v2.3 so it fixes weird behaviour:

>>> from mediafile import MediaFile
>>> f = MediaFile('/home/juniorjpdj/temp/ID3v230_Szpadyzor crew.mp3')
>>> f.mb_artistid
'164ed128-525d-4798-883f-be13502de86b'
>>> f.mb_artistids
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157',
 '680a88be-9f13-4487-add6-e0367841194d',
 'e5f3eaf9-f76b-4fee-ba4f-3208e53587c1']
>>> f.mb_albumartistids
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
>>> f.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> f.mb_artistids
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157',
 '680a88be-9f13-4487-add6-e0367841194d',
 'e5f3eaf9-f76b-4fee-ba4f-3208e53587c1']
f.mb_artistid = "kek"
f.mb_artistids
['kek']

@JuniorJPDJ JuniorJPDJ force-pushed the musicbrainz_multivalue branch 2 times, most recently from 53b3bec to c4218ff Compare May 21, 2021 23:19
@JuniorJPDJ
Copy link
Contributor Author

.

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!! So sorry this took me so long to review in full detail, but this looks perfect. Thanks for sorting this out!!

@sampsyo sampsyo merged commit ed703f9 into beetbox:master Jun 27, 2021
@JuniorJPDJ JuniorJPDJ deleted the musicbrainz_multivalue branch June 27, 2021 23:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mb_artistid and mb_albumartistid should be lists
2 participants