Basic lyrics support #354

adrianholovaty · 2024-09-11T20:09:22Z

Just to get a discussion going, here's a rough first draft of what lyrics support in MNX might look like.

Obviously there's a lot to lyrics — formatting, language encoding, extender (melisma) lines, non-word sounds such as MusicXML's <laughing> and <humming>. I'd like to design this in a way that gets us a solid foundation that we can elegantly extend later, rather than getting forever bogged down in obscure features.

For prior art, see MusicXML's approach to lyrics:
https://w3c.github.io/musicxml/musicxml-reference/elements/lyric/

My approach here is to sketch out a simple example, then provide some thoughts on how it could be extended to support more obscure aspects of lyrics.

Proposed simple example

[
  {
    "type": "event",
    "duration": {"base": "quarter"},
    "lyrics": {
      "lines": [
        {"text": "Are"}
      ]
    },
    "notes": [
       {"pitch": {"octave": 5, "step": "C"}}
    ]
  },
  {
    "type": "event",
    "duration": {"base": "quarter"},
    "lyrics": {
      "lines": [
        {"text": "you"}
      ]
    },
    "notes": [
       {"pitch": {"octave": 5, "step": "D"}}
    ]
  },
  {
    "type": "event",
    "duration": {"base": "quarter"},
    "lyrics": {
      "lines": [
        {"text": "sleep", "hyphen": "start"}
      ]
    },
    "notes": [
       {"pitch": {"octave": 5, "step": "E"}}
    ]
  },
  {
    "type": "event",
    "duration": {"base": "quarter"},
    "lyrics": {
      "lines": [
        {"text": "ing?", "hyphen": "end"}
      ]
    },
    "notes": [
       {"pitch": {"octave": 5, "step": "C"}}
    ]
  }
]

A pretty basic starting point. Lyrics would live on events. Each event would have an optional "lyrics" key, which would be a lyrics object. Its "lines" key would be an array of lyric syllable objects. Each lyric syllable is required to have a "text". (Eventually I could envision loosening this: each lyric syllable would be required to have a "text" or something like "laugh": true.)

Hyphens

The hyphen between "sleep" and "ing?" in that example is encoded as "hyphen": "start" and "hyphen": "end" on the two lyrics. This is inspired by MusicXML's syllabic element and syllabic data type. The default (if not provided) would be "single".
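As a rough illustration of how a consuming application might read this encoding, here is a Python sketch that joins one lyric line's syllables into whole words, treating a missing "hyphen" as "single". It is not part of the proposal: the function name words_from_line is made up, and events are assumed to be plain dicts shaped like the example above.

```python
def words_from_line(events, line_index=0):
    """Join the syllables of one lyric line into whole words."""
    words = []
    current = ""
    for event in events:
        lines = event.get("lyrics", {}).get("lines", [])
        if line_index >= len(lines):
            continue  # this event has no syllable on the requested line
        syllable = lines[line_index]
        current += syllable["text"]
        # "single" is the proposed default when "hyphen" is absent.
        if syllable.get("hyphen", "single") in ("single", "end"):
            words.append(current)
            current = ""
    if current:
        words.append(current)  # word was started but never ended; keep it
    return words
```

Any value other than "single" or "end" is treated as "the word continues", so a word left open at the end of the input is still returned rather than dropped.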

Multi-line lyrics

Multi-line lyrics would be supported via multiple objects in a single event's "lines" array.

[
  {
    "type": "event",
    "duration": {"base": "quarter"},
    "lyrics": {
      "lines": [
        {"text": "Are"},
        {"text": "Am"}
      ]
    },
    "notes": [
       {"pitch": {"octave": 5, "step": "C"}}
    ]
  },
  {
    "type": "event",
    "duration": {"base": "quarter"},
    "lyrics": {
      "lines": [
        {"text": "you"},
        {"text": "I"}
      ]
    },
    "notes": [
       {"pitch": {"octave": 5, "step": "D"}}
    ]
  }
]

There would need to be a way for a given event's lyric lines to "skip" a position. For example, see this old engraving of "My Way," in which there are three lines, then only the top line, then only lines 2/3:

Pretext for lyric syllables

A common practice is to display a verse number directly within the lyrics, such as the "2." here:

This could be encoded on the lyric line object as "pretext" (better name likely needed!).

{"text": "I", "pretext": "2."}

Obviously it's possible to just encode this as {"text": "2. I"}, but it's much nicer to split out the pretext. That gives the consuming application more possibilities in lyric display. Plus the engraving is subtly different: the text should be centered under the note, as if the pretext doesn't exist.
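To make the engraving difference concrete, here is a small Python sketch of the placement rule described above: the syllable text is centered under the note as if the pretext did not exist, and the pretext hangs to its left. The function name place_syllable, the gap value, and the assumption that the renderer already knows both text widths are all hypothetical.

```python
def place_syllable(note_x, text_width, pretext_width=0.0, gap=4.0):
    """Return (text_left, pretext_left) x-positions for one lyric syllable.

    The text is centered on the note regardless of any pretext.
    pretext_left is None when there is no pretext to draw.
    """
    text_left = note_x - text_width / 2
    if pretext_width:
        # The pretext sits to the left of the text and does not shift it.
        pretext_left = text_left - gap - pretext_width
    else:
        pretext_left = None
    return text_left, pretext_left
```

With {"text": "2. I"} the renderer would instead center the whole string, which pushes the "I" to the right of the note.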

Another example:

[
{"text": "be", "pretext": "(SALLY:)"},
{"text": "see", "pretext": "(BEN:)"}
]

Lyrics above the staff vs. below

Sometimes a single staff (multivoice, but not necessarily) contains lyrics both above and below the staff:

This could be encoded on the lyric line object:

{"text": "You", "placement": "above", "pretext": "SUE:"}

Perhaps we could also support setting the default placement on a sequence — that way it's only set in a single place.

We'll need to support multiline lyrics above the staff as well.

Non-lyric text that gets rendered with lyrics

It's also reasonably common in pop music to engrave non-lyric text as one of the lyric lines. It's more of an instruction, but it's important to engrave it in the lyrics area. For example:

This could be encoded as a lyric line with an empty "text" (?) and some other object, perhaps called "aside". There's also a need to specify "though this text is rendered in the lyrics space, the default lyrics spacing algorithm does not apply to it" — for which I've used "detached": true here.

[
  {
    "text": "Did",
    "pretext": "1."
  },
  {
    "text": "",
    "pretext": "2. 3.",
    "aside": "(see additional lyrics)",
    "detached": true
  }
]

Other data in the lyrics object

Apart from "lines", the lyrics object would contain data about the event's lyrics in general (that is, not referring to any of the specific lines). One thing that comes to mind is encoding a bracket before a particular event's lyrics, like this:

This could be encoded as "startBracket", with an optional size (measured in number of lyric lines?):

{
  "lyrics": {
    "startBracket": {
      "size": 2
    },
    "lines": [
      {"text": "your"},
      {"text": "my"}
    ]
  }
}

Formatting

Finally, a quick thought on formatting. I think font choices shouldn't live at this low level, as that would lead to a lot of duplication. There should be a way to set the lyric font for the document in a single place, perhaps in a layout.

mscuthbert · 2024-09-11T20:56:57Z

Wow! This is beyond "Basic" lyrics support. Super well thought through!

Quick-thought comments:

"hyphen": can we keep something like the MusicXML "syllabic", or rename it to "wordPart"? I think that "hyphen" should be reserved specifically to preserve information about the hyphen itself (should it be shown? repeated over melismas? etc.) -- something to discuss later (semantic vs. presentation, etc.).

"pretext": cool idea! I do think we need to distinguish between the "2. I graduated from" pretext case and the "(SALLY:)" pretext case. Why is there a "2." with the "I" of "I graduated from"? Easy: it indicates the start of a new verse. But why is there a "(SALLY:)" before "be_____"? Because it's the start of a new system or maybe the start of a new page. If we moved the previous measure forward or this measure backwards, presumably the position of "(SALLY:)" would move with it. Therefore I think that we need something like a "pretextRegion" concept, where the first syllable of the region would get "SALLY:" and any later syllable that (depending on configuration) began a system or page would get "(SALLY:)" [scrolling applications too, etc.] until the end of the pretextRegion.

aside and detached are super helpful and something missing in MusicXML. I think that this also would go well with anything relating to the successors to <humming> and <laughing> (add cough, tongue-click,...) so that the text can display as an aside: "mmmmm" but not have any vocal-processing VST try to find sounds for m.

I don't think that we'll get away with omitting a "number" or similar tag (NMTOKEN in MusicXML), and having it be a list would be ideal. Look at encoding the "Sal-ly rest-ing in {your/my}" example: how does the processor know that "Sally resting in" belongs to both the "your" and the "my" lines? (And from a semantic perspective, it would be good to know that, so that the layout is shifted to the middle for semantic reasons and not merely as layout.)

The "number" or similar tag should probably not be called "number" but something more generic, like "lyricId" or "lyricGroup". Looking at the examples above, there is a difference between the 1. First/2. Second lyrics and the (SALLY:) be/(BEN:) see examples. The 1./2. lyrics should not be sung at the same time, while the SALLY/BEN lyrics should probably both be sung at the same time even if there are no repeats. (A representation system that doesn't allow multiple performers on a single staff would probably insist that these be encoded as separate staves, but I hate the idea that we have to encode the same notes (and possibly the same lyrics) twice when they should only be displayed once.)

I am so glad that JSON will allow the time-only/timeOnly specification to just be an array of ints rather than a regex.
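For comparison, a Python sketch of the two approaches. The regular expression is my reading of the MusicXML time-only pattern (a comma-separated list of positive integers) and should be checked against the schema; the "timeOnly" key in the JSON fragment is only the name floated above, not settled spec.

```python
import json
import re

# Assumed shape of MusicXML's time-only value, e.g. "1,3" or "1, 3".
TIME_ONLY = re.compile(r"[1-9][0-9]*(, ?[1-9][0-9]*)*")

def parse_time_only(value):
    """Validate a MusicXML-style time-only string and return a list of ints."""
    if not TIME_ONLY.fullmatch(value):
        raise ValueError(f"not a valid time-only value: {value!r}")
    return [int(part) for part in value.split(",")]

# With a JSON array there is nothing to parse beyond the document itself.
from_json = json.loads('{"text": "First", "timeOnly": [1, 3]}')["timeOnly"]
```

Both routes end in the same list of integers; the JSON one simply skips the string validation step.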

samuelbradshaw · 2024-09-12T00:51:13Z

pretext: I like this concept, but I don't like the name (because "pretext" is a word with an unrelated meaning). How about something like "label" (I think this is what MEI uses)?

pretext feels similar semantically to name and shortName on a part. Should parts and lyrics follow similar patterns for a label/prefix/name/abbreviation?

Do both start and end need to be specified? Would it be simpler to just have a boolean flag on every syllable that indicates whether there's more of the word ahead (false if not specified)? Something like this (though I'm not confident on the property name):
{"text": "Are"} … {"text": "you"} … {"text": "sleep", "continues": true} … {"text": "ing?"}

I agree with @mscuthbert that some way to tie a set of syllables together as being part of given verse(s) or group(s) is needed. For me, a use case would be showing and hiding verses in a hymn with several verses, possibly with a chorus that would always be visible.

Should asides always be detached (if so, is the detached property needed)?

Do we need to consider a single line of lyrics tied to more than one part?

Do you think this will be flexible enough to cleanly handle multiple syllables on a single note, or multiple notes for a single syllable? Or the first verse with two syllables on a note and the second verse with only one syllable on a note?

mscuthbert · 2024-09-12T00:59:26Z

> Should asides always be detached (if so, is the detached property needed)?

The detached property may be needed for when an engraver does not mind that a syllable overlaps with the next note. There are times when a syllable is long (e.g. "sixths") but the next note or two doesn't have a lyric, and overlap is acceptable. (Or where the next note only has a syllable in another verse.) Maybe it's a three-way: inherit, yes, no.

My feeling with MNX is that, since we are not optimizing for ease of human writing of JSON, we should keep them separate. There are two separate concepts (aside: is it something in the "lyrics" space that is not to be sung; detached: should this lyric not affect layout) that happen to be highly correlated.

I had some of the same misgivings about "pretext", but "label" could be too generic and doesn't specify the placement. What about "before", to align with the CSS naming?

samuelbradshaw · 2024-09-12T01:31:14Z

> Do you think this will be flexible enough to cleanly handle multiple syllables on a single note, or multiple notes for a single syllable? Or the first verse with two syllables on a note and the second verse with only one syllable on a note?

For multiple syllables in a note, I guess you could just put the whole thing in the text property:
{"text": "Are you"} … {"text": "sleep", "continues": true} … {"text": "ing?"}
{"text": "Are"} … {"text": "you"} … {"text": "sleeping?"}
{"text": "¿Qué ̮es"} … {"text": "lo"} … {"text": "que"}

For multiple notes on a syllable:
{"text": "Are"} … {"text": "you"} … {"text": "slee", "continues": true} … {"continues": true} … {"text": "ping?"}
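A quick Python sketch of how a reader could rebuild words from this "continues" encoding, including the text-less melisma entry in the last example. The function name join_syllables is made up for illustration, and the syllables are assumed to arrive as a flat list of dicts for a single lyric line.

```python
def join_syllables(syllables):
    """Rebuild words from syllables that use a boolean "continues" flag."""
    words = []
    current = ""
    for syllable in syllables:
        # A melisma placeholder may carry no "text" at all.
        current += syllable.get("text", "")
        # "continues" defaults to False when not specified.
        if not syllable.get("continues", False):
            words.append(current)
            current = ""
    if current:
        words.append(current)  # trailing syllable still marked as continuing
    return words
```

Note that a syllable holding several words, such as "Are you", comes back as a single entry; splitting it further would need the space handling discussed elsewhere in this thread.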

lemzwerg · 2024-09-12T04:41:30Z

I like the 'continues' idea since there are languages like Chinese that have multi-syllable words but no hyphens in between (if written with CJK characters).

lemzwerg · 2024-09-12T04:45:37Z

There might also be dense typesetting situations where hyphens are not shown at all. However, sometimes a hyphen should be enforced even in such dense typesetting, and for such situations there should be an explicit 'hyphen' object.

In other words, I consider using 'continues' as a helpful means to separate content from layout.

cyrilcoutelier · 2024-09-12T07:23:10Z

I think that the lines array might not be enough to account for gaps in the verses. For instance a note that would only have lyrics for the 2nd verse. Or another that would have lyrics for verse 1 and 3.
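To illustrate the gap problem, here is a Python sketch under a loud assumption: each syllable carries an explicit identifier, using the "lyricId" name floated earlier in this thread. That key is hypothetical and not in the proposal, as is the function name syllables_for. With such an identifier, an event that only has a syllable for verse 2 is unambiguous, whereas a purely positional array would read it as verse 1.

```python
def syllables_for(events, lyric_id):
    """Collect one verse's syllable texts, with None where an event skips it."""
    result = []
    for event in events:
        lines = event.get("lyrics", {}).get("lines", [])
        # "lyricId" is a hypothetical key, not part of the proposal.
        text = next(
            (s["text"] for s in lines if s.get("lyricId") == lyric_id),
            None,
        )
        result.append(text)
    return result
```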

williamclocksin · 2024-09-12T07:39:50Z

I assumed that the lyrics were associated with the note, and the lyric syllable would know which verse it is in, so the lines array is simply an additional data structure that has the syllables of the line on it (useful to scan through all the syllables on a line to put the hyphens and melismas in the right places) but also useful information pertaining to the line itself such as line position and spacing.

adrianholovaty · 2024-09-12T07:47:32Z

Responding to @samuelbradshaw:

> Do both start and end need to be specified? Would it be simpler to just have a boolean flag on every syllable that indicates whether there's more of the word ahead (false if not specified)? Something like this (though I'm not confident on the property name): {"text": "Are"} … {"text": "you"} … {"text": "sleep", "continues": true} … {"text": "ing?"}

I agree that would be simpler and perhaps better. In fact that's how our own internal format at Soundslice works — a lyrics syllable simply has a boolean that answers the question "Is this syllable meant to continue into the next one?" That's worked well enough for us for 7+ years of lyrics support. And we support non-trivial stuff like lyrics-only view.

I don't immediately see the need for "end" — i.e., knowing whether a syllable is meant to finish the previous syllable. It introduces an opportunity for error and ambiguity (if the first syllable has "start" but the second syllable lacks "end"), and I don't know what the benefit is. Anybody have a clear use case for encoding both the "start" and "end" as opposed to just the "start"?
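The ambiguity can be made concrete with a small Python sketch of the validation a reader would need if both "start" and "end" were encoded. It only knows the three values from the original proposal ("start", "end", and the default "single"); the function name hyphen_errors and the message strings are made up.

```python
def hyphen_errors(syllables):
    """Return (index, message) pairs for inconsistent start/end markers."""
    problems = []
    open_word = False  # True while a "start" is waiting for its "end"
    for i, syllable in enumerate(syllables):
        kind = syllable.get("hyphen", "single")
        if kind == "start":
            if open_word:
                problems.append((i, "start inside an unfinished word"))
            open_word = True
        elif kind == "end":
            if not open_word:
                problems.append((i, "end without a preceding start"))
            open_word = False
        else:  # anything else is treated as "single"
            if open_word:
                problems.append((i, "previous start was never ended"))
            open_word = False
    if open_word:
        problems.append((len(syllables), "line ends inside a word"))
    return problems
```

With a single boolean on the first syllable, none of the mid-line mismatches can be expressed, so that part of the check would not be needed.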

mscuthbert · 2024-09-12T18:24:37Z

Chairs meeting noted that we need a way of encoding line-break (or start-of-line) for being able to extract lyrics to poetic blocks. (e.g., https://www.soundslice.com/slices/bwZcc/ -- then click microphone at bottom right)

samuelbradshaw · 2024-09-12T18:39:40Z

I would very much like improvements that make it easier to extract lyrics into words, phrases, and poetic blocks. Some of the challenges I'm aware of currently are:

Knowing where a lyric phrase starts or ends (new line)
Knowing where a lyric block (verse / chorus / etc.) starts or ends (double new line, or new paragraph)
Knowing where to put spaces (in English you can assume spaces between words, but that's not true in all languages)
Knowing where to put hyphens (the music has hyphens between every syllable, so it's tempting to strip them all out for a lyrics view, but some words are hyphenated as part of their spelling, in English and other languages)
Handling choruses that only appear in the music once, but should be repeated after every verse in a lyrics view
Following repeats and jumps generally (you can't just extract all of the syllables in the order they appear in the music)
Handling places where different voices sing different lyrics simultaneously

If the new line is just part of the text, I think the standard way to encode that in JSON is with an escaped newline, \n (written as \\n when the JSON is itself embedded in a string literal; see https://stackoverflow.com/a/42073). But maybe there's a better way to semantically indicate phrase breaks and block breaks that doesn't involve new-line characters.
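For reference, a short Python sketch of how a newline inside a lyric text round-trips through JSON, using only the standard json module. In the JSON document itself the newline is a backslash followed by "n"; the doubled backslash only appears when that JSON is written inside another language's string literal. The sample text is made up.

```python
import json

# A lyric text containing a real newline character.
syllable = {"text": "line one\nline two"}

# Serializing escapes the newline as backslash + "n" in the JSON text.
encoded = json.dumps(syllable)

# Parsing turns the escape back into a real newline.
decoded = json.loads(encoded)
```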

adrianholovaty · 2024-09-26T16:02:12Z

I've put pull request #355 together, which adds basic lyrics support to the MNX spec. I just wanted to get the "bones" of it in — lyric lines and hyphens — as opposed to dealing with the various special cases discussed in this thread. The most important thing is that the basic structure of the encoding will be forwards-compatible with the various special cases. Feedback welcome!

williamclocksin · 2024-09-27T04:07:50Z

One place I frequently used a start-end indicator in Calliope (though start-end was calculated at display time rather than in the file contents) is when a continuation (either hyphen or melisma) extended over more than one system, even as many as three systems. In florid monody of the early C17, it is common to see this.

adrianholovaty · 2024-10-24T18:11:49Z

All right, as of 230dab7 we now have basic lyrics support! I'm closing this issue to keep things focused. We'll open separate issues for the various other lyric cases, as @samuelbradshaw has done with #357.

adrianholovaty mentioned this issue Oct 10, 2024

Added spec for lyrics and lyric lines, with two new example docs #355

Merged

samuelbradshaw mentioned this issue Oct 11, 2024

Encoding spaces, hyphens, phrases, and blocks in lyrics #357

Open

adrianholovaty closed this as completed Oct 24, 2024
