display submodule #28
Comments
FWIW, this issue is starting to pop up in jams marl/jams#93 wrt pretty-printing annotations. It might be good to sync on this, just to figure out what reasonable requirements would be. |
I think this issue was mainly for plots, not pretty-printing annotations, but the two are roughly related. |
"plots" meaning ... what? |
You know, plots. |
Er, plots of what? Scores? Annotations? I'd argue that score-plotting is best handled by collecting results in a dataframe and using seaborn. (Or other custom code; point is we'd likely be reinventing the wheel to do that in mir_eval). Annotation-plotting is a different story though. |
Annotations. |
Ok; that's why I ask. :) Note that by "pretty-printing" what I actually mean is notebook-embed and svg rendering. (There are several good reasons to want direct svg output for this, rather than going through matplotlib, but I digress.) It'd be good if we could minimize overlap in functionality/implementation on this. |
Dead feature? |
Not dead, but (IMO) it's more natural to fit this into JAMS than mir_eval. |
Yeah, thanks! 😃 |
Reversing my prior comments on this after discussing in the jams threads: mir_eval should have some basic viz/plotting, and JAMS can decorate these with metadata/subplots/etc. Let's make a wish-list of how this should work?
Additional issues:
|
Thanks for starting this discussion.
Probably allow for either.
All sounds reasonable to me.
A common plot is a time-frequency plot with different sources drawn in different colors. It would definitely be cool to have an automated method to create plots like Figure 1d here http://hearingbrain.org/docs/mesgarani_chang_nature_2012.pdf
Not worth it, I don't think. Same for key.
Sounds good.
As a first pass, I am fine with forcing the user to handle subplots. This seems like something you will have to think about more in JAMS, because you will potentially have N things to compare. |
Again, same as sonification, this needs to be separated into notes and continuous f0 (pitch contour). For the note case I'd go with overlapping piano rolls as in this gist I cooked up a while back for a potential JAMS viz module. For pitch contour, continuous curves on a log-frequency scale (I think a cent scale would be most informative) sounds good to me. We'll have to think carefully about styling, since visually comparing curves with a large amount of overlap can get a little tricky (so alpha / color / thickness are things we'll have to play with). Re: convention, for the JAMS example it actually takes a list of annotations and visualizes all of them (without any constraints on the length of the list). Being able to visualize 1, 2, or N annotations together would be my preference, though it might not fit into the ref/est paradigm of |
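For the cent-scale idea above, a minimal sketch of the conversion might look like the following. The function name `hz_to_cents`, the 440 Hz reference, and the choice to map unvoiced frames (non-positive frequencies) to NaN are all assumptions made for illustration, not something settled in this thread.

```python
import math


def hz_to_cents(freq_hz, ref_hz=440.0):
    """Convert a frequency in Hz to cents relative to ref_hz.

    Hypothetical helper: the reference frequency is an arbitrary choice,
    and non-positive inputs (unvoiced frames) are mapped to NaN so that
    a line plot simply leaves a gap there.
    """
    if freq_hz <= 0:
        return float("nan")
    # One octave (a doubling of frequency) spans 1200 cents.
    return 1200.0 * math.log2(freq_hz / ref_hz)


contour_hz = [220.0, 440.0, 0.0, 880.0]
contour_cents = [hz_to_cents(f) for f in contour_hz]
```

Plotting `contour_cents` against time gives equal vertical spacing per semitone, which is what makes reference and estimate curves comparable by eye.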
I'd rather stick to one or the other, since they're pretty different from an api perspective.
Oooooh, those are slick! Really difficult to automate well, but I have some thoughts on how to do it. The trick will be getting it to work with more than two sources and not look like clown barf. Jotting implementation thoughts for how this can work:
Agreed. Time-varying versions of this can dump into the disjoint interval plotter.
Right. As input, each viz function should accept a target axis, or create a new one if none is provided. Each viz should return the constructed axis. |
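The axis convention described above could be sketched roughly as follows. `plot_segments` is a hypothetical function name, and using `plt.gca()` to "create a new one if none is provided" is one possible reading (it reuses the current axes if there is one and creates axes otherwise).

```python
import matplotlib.pyplot as plt


def plot_segments(boundaries, ax=None, **kwargs):
    """Shade each segment between consecutive boundaries (hypothetical viz).

    Accepts a target axis, falls back to the current axes when none is
    given, and returns the axis it drew on.
    """
    if ax is None:
        ax = plt.gca()
    kwargs.setdefault("alpha", 0.3)
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        ax.axvspan(start, end, **kwargs)
    return ax


# The caller owns the subplot layout, as suggested earlier in the thread.
fig, (ax_ref, ax_est) = plt.subplots(2, 1, sharex=True)
plot_segments([0.0, 10.0, 25.0], ax=ax_ref)
plot_segments([0.0, 12.0, 25.0], ax=ax_est)
```

Returning the axis lets a caller (JAMS, for instance) decorate the result with titles or metadata afterwards.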
Good point, I'd forgotten about piano-roll. I like your version of it; the comparison can be done easily by plotting the reference first, then the estimate on the same axes, and giving both alpha values.
Yup. The styling, I'd imagine, should be set by pass-through kwargs, but we should try to cook up some good defaults.
That, I think, is a separate issue. In JAMS, I'd be happy to just stack up the entire list of annotations vertically, since they'll generally be heterogeneous anyway. Ref-vs-est would be a separate function (functions?) on top of the core plotting routines that make head-to-head comparisons easier. |
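As a rough sketch of the piano-roll comparison described above (reference first, then estimate on the same axes, both with alpha, styling via pass-through kwargs): the names `piano_roll` and `compare_piano_roll`, the `(intervals, pitches)` input shape, and the default colors are assumptions for illustration only.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle


def piano_roll(intervals, pitches, ax=None, **kwargs):
    """Draw one rectangle per note; extra kwargs pass through to Rectangle.

    Assumes intervals are (start, end) pairs in seconds and pitches are
    MIDI note numbers, so each note occupies one unit of height.
    """
    if ax is None:
        ax = plt.gca()
    kwargs.setdefault("alpha", 0.5)
    for (start, end), pitch in zip(intervals, pitches):
        ax.add_patch(Rectangle((start, pitch - 0.5), end - start, 1.0, **kwargs))
    ax.autoscale_view()
    return ax


def compare_piano_roll(ref, est, ax=None, ref_kw=None, est_kw=None):
    """Head-to-head helper built on top of the core routine."""
    ref_kw = {"facecolor": "C0", **(ref_kw or {})}
    est_kw = {"facecolor": "C1", **(est_kw or {})}
    ax = piano_roll(*ref, ax=ax, **ref_kw)  # reference goes down first
    return piano_roll(*est, ax=ax, **est_kw)


fig, ax = plt.subplots()
ref = ([(0.0, 1.0), (1.0, 2.0)], [60, 62])
est = ([(0.1, 1.0)], [60])
compare_piano_roll(ref, est, ax=ax)
```

Keeping the comparison as a thin wrapper leaves the core routine usable for stacking arbitrary annotation lists.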
The main difference in API is that for vlines you have to specify the top and bottom, right? I would be fine with using the supplied axis' ylim, as long as that behavior was specified, and defaulting to dots. Or, yeah, we can do one or the other.
This is roughly what I was thinking too.
Yes.
and
This makes sense for JAMS and not |
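The vlines-versus-dots question raised in this comment could be handled along these lines. This is only a sketch: `plot_events` and its `style` argument are hypothetical, and the behavior shown (vertical lines spanning whatever y-limits the supplied axis already has, dots as the default) is one way to read the suggestion.

```python
import matplotlib.pyplot as plt


def plot_events(times, style="dots", ax=None, **kwargs):
    """Mark event times as dots (default) or as full-height vertical lines."""
    if ax is None:
        ax = plt.gca()
    bottom, top = ax.get_ylim()
    if style == "vlines":
        # Span the y-limits the axis had when we were called.
        ax.vlines(times, bottom, top, **kwargs)
    elif style == "dots":
        kwargs.setdefault("marker", "o")
        kwargs.setdefault("linestyle", "none")
        ax.plot(times, [(bottom + top) / 2.0] * len(times), **kwargs)
    else:
        raise ValueError("style must be 'dots' or 'vlines'")
    # Restore the limits so drawing does not change the caller's view.
    ax.set_ylim(bottom, top)
    return ax


fig, ax = plt.subplots()
ax.set_ylim(0, 2)
plot_events([0.5, 1.0, 1.5], style="vlines", ax=ax)
```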
sgtm. once the viz is coded up (sonification is mostly there I guess) I think it'd be really nice for the community if we created a notebook illustrating these functionalities using both mir_eval and JAMS. Maybe even an LBD. |
Source separation viz demo .. not too hard to get about 85% of the way there.
I'm into this idea. Because we don't have enough things on our plate as is. |
Notebook updated to support optional text labels in disjoint intervals, and pre-set reference labels for overlapping intervals (not demoed in the notebook currently, but functional). Short-term to-dos:
Longer-term to-dos:
|
Notebook updated for overlapping interval comparisons with unequal label sets. Property cycling is still a little broken here, but it's good to know that the basic idea works. Double-update: it can now infer reference labels from pre-existing tick markers. This is somewhat brittle because detecting empty ticks in matplotlib isn't easy. It's certainly possible to break it (say by overlapping a plot on top of another plot with no ticks), but I can't imagine it would happen frequently or in meaningful scenarios. |
Agreed here.
It seems to me like a logical and sane way to handle this for chords (where we can have huge, but structured, vocabularies) would be to have color mapped to chord root and vertical position mapped to chord type. The same twelve colors would always correspond to the twelve roots, and for example in a min/maj vocabulary, we'd have two vertical positions. The number of possible chord types could be inferred at call time or user-specified, and would have a consistent ordering (major on the bottom, then minor, then seventh, etc). Does this seem reasonable? I don't think we need a separate row for each chord type, as your example currently shows. Otherwise looking good to me.
This would be very cool, but we should focus on nice visualization for the baseline of what people have installed first. |
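To make the proposal above concrete (color from chord root, vertical position from chord type), here is a small pure-Python sketch. It assumes labels of the form `root:quality/bass` with `N` for no-chord; `chord_layout`, the `PREFERRED` ordering, and the use of index 12 for no-chord are all hypothetical choices.

```python
# Pitch classes of the natural notes; accidentals are applied on top.
NATURALS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# Bottom-to-top ordering of chord types; other types follow alphabetically.
PREFERRED = ["maj", "min", "7"]


def split_chord(label):
    """Split 'root:quality/bass' into (root, quality), dropping the bass."""
    root, _, quality = label.split("/")[0].partition(":")
    return root, quality or "maj"


def pitch_class(root):
    """Return the pitch class (0-11) of a root such as 'C', 'F#' or 'Bb'."""
    pc = NATURALS[root[0]]
    for accidental in root[1:]:
        pc += 1 if accidental == "#" else -1
    return pc % 12


def chord_layout(labels, quality_order=None):
    """Map each distinct label to (color_index, row).

    The row order is inferred from the labels unless quality_order is given.
    """
    chords = {lab: split_chord(lab) for lab in set(labels) if lab != "N"}
    if quality_order is None:
        seen = {quality for _, quality in chords.values()}
        quality_order = [q for q in PREFERRED if q in seen]
        quality_order += sorted(seen - set(PREFERRED))
    layout = {
        lab: (pitch_class(root), quality_order.index(quality))
        for lab, (root, quality) in chords.items()
    }
    if "N" in labels:
        layout["N"] = (12, None)  # no-chord: its own color, no row
    return layout


layout = chord_layout(["C:maj", "C:min", "A:min7/b3", "Bb:maj", "N"])
```

The color index would then select from a fixed 12-entry palette, so the same root always gets the same color across plots.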
I think a simpler version of this is to just reduce the color property to depend only on the root (+ no-chord) and leave the rest of the logic (lexicographic sorting) as is. I've already dropped the hatching cycler from the overlap-interval plot, since vertical position is enough to disambiguate and even color is just eye-candy. Some potential snags I see here:
I think we do. If you want them plotted together, they should be reduced first by a helper routine. In general, determining "equivalence" between different labels is a way bigger problem than we should hope to resolve within a plotting function. Is "A" the same as "A/3"? What about "verse" and "verse'"? It's an endless rabbit hole. |
Yes, I think so. Chords are different enough from segments (mainly because of potential vocab size) that I think this is reasonable; they can use the same core routine but chord can have additional logic for e.g. coloring things by root.
I think grouping (assuming by "grouping" you mean putting things in the same vertical row) by root is totally sane and probably expected behavior. Lexical sorting after grouping by root sounds fine to me.
I think I wasn't clear, what I meant was each different chord. To try to get on the same page, your example shows
I think it would be more sane if it was
where all things labeled Of course, if you decide it works best having a completely separate row for each individual chord label, I think that could work OK too, it will just have to be a very tall plot to get all of the labels visible on the y axis, and might be hard to follow. |
Okay. In that case, let me finish up a solid prototype of the current general-purpose plotter and then we can figure out how to build a chord plotter on top of it. FWIW, I'm thinking much more generally than "segments" here, and designing more for things like time-varying labels, which can also have a large and open vocab.
If by "row" you mean "block of rows", then yes. I don't expect/wouldn't understand how to read a plot with different qualities on the same root-oriented row.
I would have a much harder time reading the second one than the first. It would also require a lot more visual style differentiation to separate "C" from "D" and "E". I would expect something more like
where chords sharing the same root have a common color. This is essentially what we have now (except the color) -- your first example is not how the current implementation works.
I'm not sure that it would. Most songs don't have that many unique chords, and if we provide reduction mechanisms (say, collapse to 25 or 25+sevenths, etc), it would probably retain the info most people care about and still be simple enough to parse. I'd be curious for @ejhumphrey's input here, since he knows a thing or two about chords. |
My vote would be to place the label text in the middle of the rectangle (both vertically and horizontally) - I think it'd be the easiest to see this way. This might be overkill but an optional parameter for setting the label size might also be appreciated. |
You can pass through arguments (eg font size) to the annotation object through I'm not a fan of centering the annotation inside the patch for two reasons:
|
Then I'd go with NW |
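A minimal sketch of NW (top-left) label placement with pass-through text arguments might look like this. `labeled_rect` and its `text_kw` parameter are hypothetical names; the small point offset is an arbitrary default.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle


def labeled_rect(ax, start, end, bottom, height, text, text_kw=None, **rect_kw):
    """Draw an interval patch with its label anchored at the top-left corner."""
    rect = Rectangle((start, bottom), end - start, height, **rect_kw)
    ax.add_patch(rect)
    text_kw = dict(text_kw or {})
    text_kw.setdefault("ha", "left")
    text_kw.setdefault("va", "top")
    ann = ax.annotate(
        text,
        xy=(start, bottom + height),  # NW corner in data coordinates
        xytext=(2, -2),               # nudge inside the patch, in points
        textcoords="offset points",
        **text_kw,
    )
    return rect, ann


fig, ax = plt.subplots()
rect, ann = labeled_rect(ax, 0.0, 4.0, 0.0, 1.0, "verse",
                         text_kw={"fontsize": 14}, alpha=0.4)
ax.autoscale_view()
```

Anchoring in data coordinates keeps the label attached to the patch when the view is zoomed or panned.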
Don't you think there are at least
I think this is the important question and is where we see things differently (resulting in different visualization requirements), really. Can you do some quick JAMS wizardry to get the statistics of the number of chord labels per song across the datasets you guys have converted?
Yeah, this would be good but it's a big if - do intuitive mechanisms exist? Can we also indicate when a reduction has taken place in a reasonable way? These are all questions I can't answer but @ejhumphrey probably can. |
I'll just add to the mix that I don't think it's reasonable for a user to memorize a mapping from 12 colors to 12 chords (even if the colors are easily distinguishable), even with a legend at hand. This is a bit of a side note, but I think it'd be cool to try and target these visualizations not just at MIR experts (or generally people who are good at parsing graphs). Looking into the future, I can see a mir_eval/JAMS based annotation collection tool targeted at amateur musicians; providing something they can make sense of would be neat. |
Sure, but the explicit values are much less important than just being able to compare them. The most important operation when looking at the graph will be "Are these two chords the same?" both when comparing two (ref/est) graphs and when looking at a single one. I mean, you could say the exact same thing about vertical positioning of different chords - of course we don't expect people to memorize a mapping between height and N different chords either. In this case, we just want them to be able to see that the chords are different (because they have different vertical heights), and they can refer to the y labels as needed. |
Yes, as long as labels are supported it should be ok. I actually disagree about the values being much less important - for a meaningful error analysis you really want to see how labels get misclassified. But perhaps I'm going off topic :) |
That's not going to translate to print/colorblind-friendly modes very easily.
Who needs jams when we've got shell scripts? Here's the histogram on beatles: The max is 30 (Strawberry Fields Forever) and is only that high because of inversions. I'm too lazy to script something that squashes inversions, but I imagine a more reasonable upper bound is something like 15.
Wat? I'm imagining something very explicit, like mir_eval.display.chords(intervals, mir_eval.chord.reduce_majmin(labels))
I'm not sure I agree. Easily spotting identical chords is important, but it's also important to be able to see patterns of repeating progressions. |
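The explicit reduction step and the inversion squashing mentioned above could be sketched as below. Both helpers are hypothetical stand-ins written for this illustration (they are not taken from mir_eval), they assume `root:quality/bass` labels, and the maj/min rule is deliberately crude: anything that is not clearly major-like or minor-like is left as-is.

```python
def squash_inversion(label):
    """Drop the '/bass' part of a chord label, e.g. 'A:maj/3' -> 'A:maj'."""
    return label.split("/")[0]


def reduce_majmin(labels):
    """Crudely collapse chord labels onto a major/minor vocabulary."""
    reduced = []
    for label in labels:
        base = squash_inversion(label)
        root, _, quality = base.partition(":")
        if root == "N":
            reduced.append("N")
        elif quality.startswith("min"):
            reduced.append(root + ":min")
        elif quality == "" or quality.startswith("maj") or quality[0].isdigit():
            # Bare roots, maj* and dominant-style qualities ('7', '9') -> major.
            reduced.append(root + ":maj")
        else:
            reduced.append(base)  # e.g. dim, aug, sus: left unreduced
    return reduced


labels = ["C:maj7/3", "A:min7", "G:7", "N", "B:dim", "C:maj/5"]
n_unique = len(set(labels))
n_squashed = len(set(squash_inversion(lab) for lab in labels))
n_reduced = len(set(reduce_majmin(labels)))
```

Counting distinct labels before and after each step gives a quick feel for how tall a one-row-per-label plot would need to be.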
One more comment that maybe got lost in the mix: Using color to indicate identity basically means that we can't plot two annotations on top of each other, and I'd really hate to lose that ability. |
Yes, of course.
In that case it seems probably ok to do vertical spacing. Though 30 is a lot of y labels. Even 15 is going to mean a very tall figure though when you want the labels to be legible. We could maybe do some kind of hierarchical labeling like I do in Figure 1(h) but that would be hard to do cleanly.
For sure, I think you could see it either way! Colors show pattern too...
Assuming that you mean plotting two annotations on the same axis with alpha, color works fine for this as long as no combination of colors appears in the colormap, which seems true in general, especially for colormaps where the lightness is uniform. But anyways, I can live with the vertical case. I might just be a pain in the ass about y labeling. |
Why? ("sonic-visualizer does it" is not an acceptable answer) |
Because that's closest to how musicians annotate chords in charts and I find it more intuitive than the other 3 corners. |
Ah, yeah, I was thinking just being able to tell whether the two annotators agreed, not which annotator a chord came from. I can understand if that's something that should be preserved. |
Seems to me like this conversation went way off the rails here.. funny how chords always seem to do that. More importantly: from an API perspective, is there anything that seems strange? Anything that should be exposed/controllable? (Embedded annotations are an obvious one, anything else?) |
The API seems reasonable to me. I think the functions could be named a little more clearly, i.e. overlap_intervals is useful even for disjoint intervals when they are dense (chords). I might expect people to want to supply custom kwargs to
or something. |
Agreed. Any suggestions?
You already can;
Yeah, I'm not sure what a better name would be though. |
Not really... maybe
Ah, I see that.
|
How about Unrelatedly: I think it actually makes sense to drop property cycling entirely from the labeled-interval display. I think most of the time, what you'd actually want is annotation-level coloring, since height differentiates label. If you have a fixed style for the entire annotation, then overlaying two annotations becomes trivial, and intuitively works the way you'd expect most matplotlib routines to. Similarly, for the segment display, I think it makes more sense to rely on the axes' internal fill style cycler, rather than roll our own. This way, if you draw multiple segmentations on the same plot (eg for hierarchies), the cycler maintains state naturally. It also punts the styling decisions upstream to the matplotlib style sheets, so again, things work in a manner which is consistent with the rest of MPL. I'll have a PR ready shortly, then we can move the discussion over there.
ehh.. |
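A sketch of the annotation-level coloring idea from the comment above: each call draws a whole annotation in one style, and the color comes from the axes' own property cycle because `barh` picks the next cycle color when none is given. `labeled_intervals` and its `label_set` argument are hypothetical names, and sharing rows through an explicit `label_set` is an assumption made to keep the example short.

```python
import matplotlib.pyplot as plt


def labeled_intervals(intervals, labels, label_set=None, ax=None, **kwargs):
    """Draw labeled intervals, one row per label, one style per call."""
    if ax is None:
        ax = plt.gca()
    if label_set is None:
        label_set = sorted(set(labels))
    row = {label: i for i, label in enumerate(label_set)}
    starts = [start for start, _ in intervals]
    widths = [end - start for start, end in intervals]
    kwargs.setdefault("alpha", 0.5)
    # No color given: barh takes the next color from the axes' cycle,
    # so successive annotations on the same axes get distinct colors.
    ax.barh([row[lab] for lab in labels], widths, left=starts,
            height=0.8, **kwargs)
    ax.set_yticks(list(range(len(label_set))))
    ax.set_yticklabels(label_set)
    return ax


fig, ax = plt.subplots()
vocab = ["A", "B", "C"]
labeled_intervals([(0, 2), (2, 5)], ["A", "B"], label_set=vocab, ax=ax)
labeled_intervals([(0, 3), (3, 5)], ["A", "C"], label_set=vocab, ax=ax)
```

Because the styling comes from the cycle, a matplotlib style sheet changes the look of both annotations without any code in the plotting function.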
Moving discussion to #196. |
For making useful qualitative plots for eval-by-eye