Encoding spaces, hyphens, phrases, and blocks in lyrics #357

Open
samuelbradshaw opened this issue Oct 11, 2024 · 0 comments

There are several challenges that make it difficult to extract lyrics from sheet music in a readable form. Among them:

  • Knowing where a lyric phrase starts or ends
  • Knowing where a lyric block (verse / chorus / etc.) starts or ends
  • Knowing where to put spaces (in English you can assume spaces between words, but that's not true in all languages)
  • Knowing whether a hyphen is part of the word, or a soft hyphen that can be shown or hidden as needed

Not all applications need a way to extract lyrics from sheet music. However, extracting lyrics is helpful for music viewers whose users have varying musical experience. It can also be used to generate guitar lyric sheets (lyrics lined up with chords). I also predict it will become important for computers to be able to extract well-formatted lyrics from sheet music behind the scenes for things like generated singing (the intersection of MIDI and text-to-speech).

I'd like to propose [Proposal 1] that spaces and the two types of hyphens be explicitly encoded in the sheet music. The syllables in a phrase like "Sun-dried raisins? Yes!" might look like this:

{ "text": "Sun-" } … { "text": "dried " } … { "text": "rai•" } … { "text": "sins? " } … { "text": "Yes!" }

Notice that the original hyphen and spaces are preserved. "•" is used to indicate a soft hyphen (following the syntax in dictionary definitions, where "•" is placed between syllables). Alternatively, it could be something like this (a little more verbose):

{ "text": "Sun", "hyphen": "grammatical" } … { "text": "dried " } … { "text": "rai", "hyphen": "discretionary" } … { "text": "sins? " } … { "text": "Yes!" }

Or (even more verbose):

{ "text": "Sun", "hyphen": "grammatical" } … { "text": "dried", "suffix": " " } … { "text": "rai", "hyphen": "discretionary" } … { "text": "sins", "suffix": "? " } … { "text": "Yes", "suffix": "!" }
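
To show how a consumer might rebuild readable lyrics from the most verbose form, here is a rough Python sketch. The field names (`text`, `hyphen`, `suffix`) come from the example above; the function name `join_syllables`, the `show_discretionary` flag, and the choice to emit the hyphen before the suffix are my own assumptions, not part of any spec.

```python
# Sketch only: "text", "hyphen", and "suffix" follow the example above.
# join_syllables and show_discretionary are hypothetical names.
def join_syllables(syllables, show_discretionary=False):
    parts = []
    for syl in syllables:
        parts.append(syl["text"])
        hyphen = syl.get("hyphen")
        if hyphen == "grammatical":
            # Part of the word itself, so it is always kept.
            parts.append("-")
        elif hyphen == "discretionary" and show_discretionary:
            # Soft hyphen: only shown when the caller asks for it.
            parts.append("-")
        # Assumption: any suffix comes after the hyphen.
        parts.append(syl.get("suffix", ""))
    return "".join(parts)


syllables = [
    {"text": "Sun", "hyphen": "grammatical"},
    {"text": "dried", "suffix": " "},
    {"text": "rai", "hyphen": "discretionary"},
    {"text": "sins", "suffix": "? "},
    {"text": "Yes", "suffix": "!"},
]

print(join_syllables(syllables))                           # Sun-dried raisins? Yes!
print(join_syllables(syllables, show_discretionary=True))  # Sun-dried rai-sins? Yes!
```

The first call is what a lyric sheet would want; the second is closer to what gets drawn under the notes.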

In issue #354 and pull request #355, two paradigms were discussed for managing syllables: start, middle, end, and whole syllable types (borrowed from MusicXML), or continues/hyphen drawing instructions. I'd like to advocate [Proposal 2] that we use the continues/hyphen paradigm (sorry, I've gone back and forth on this).

What MusicXML does is semantic (usually good) – but only in alphabetic writing systems. In many Asian languages, you have to "break the rules" to get the sheet music you want. For example, in Chinese, each character is equivalent to a syllable. Most Chinese words are composed of two characters. Following the MusicXML pattern, it would be intuitive to mark the first character of a word as start, and the second character as end. But there's a problem – Chinese sheet music isn't drawn with hyphens between syllables. So, you have to mark each character as whole (neither semantic nor intuitive).

One of the arguments for sticking with the MusicXML pattern is that knowing start and end can help a graphics engine provide enough space between words. I think [Proposal 1] can meet this need by preserving spaces, which will naturally keep separate words apart.
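
As a quick illustration of why preserved spaces may be enough, here is a small Python sketch using the compact form from [Proposal 1]. It assumes "•" is the soft-hyphen marker; `plain_text` and `ends_word` are hypothetical helper names, and the two-character Chinese example is just my own illustration.

```python
# Sketch only: assumes "•" marks a soft hyphen, as in the compact example.
SOFT_MARK = "•"


def plain_text(syllables):
    # Concatenate syllables as written and hide the soft hyphens.
    return "".join(s["text"] for s in syllables).replace(SOFT_MARK, "")


def ends_word(syllable):
    # A trailing space tells a layout engine that a word ends here.
    return syllable["text"].endswith(" ")


english = [
    {"text": "Sun-"},
    {"text": "dried "},
    {"text": "rai•"},
    {"text": "sins? "},
    {"text": "Yes!"},
]
# Two characters forming one word: no spaces and no hyphens are encoded.
chinese = [{"text": "你"}, {"text": "好"}]

print(plain_text(english))              # Sun-dried raisins? Yes!
print(plain_text(chinese))              # 你好
print([ends_word(s) for s in english])  # [False, True, False, True, False]
```

No start/middle/end/whole flags are needed in either language; the word boundaries fall out of the text itself.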

Finally, I think it would be helpful to add [Proposal 3] attributes that indicate the end of a lyric phrase and/or lyric block. Something like this:

{ "text": "Sun", "hyphen": "grammatical" } … { "text": "dried " } … { "text": "rai", "hyphen": "discretionary" } … { "text": "sins? " } … { "text": "Yes!", "endPhrase": "true", "endBlock": "true" }
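
For what it's worth, here is a rough Python sketch of how an extractor could use these attributes to split lyrics into blocks and phrases. `endPhrase` and `endBlock` are the proposed attributes; `split_lyrics` and `is_set` are hypothetical names, and accepting both `true` and `"true"` is just an assumption, since the value type isn't settled.

```python
# Sketch only: endPhrase/endBlock are the attributes proposed above.
def is_set(syllable, key):
    # Assumption: accept a JSON boolean or the string "true".
    return syllable.get(key) in (True, "true")


def split_lyrics(syllables):
    """Return a list of blocks, each block being a list of phrase strings."""
    blocks, phrases, current = [], [], []
    for syl in syllables:
        current.append(syl["text"])
        if syl.get("hyphen") == "grammatical":
            current.append("-")  # discretionary hyphens stay hidden
        end_block = is_set(syl, "endBlock")
        if is_set(syl, "endPhrase") or end_block:
            phrases.append("".join(current).strip())
            current = []
        if end_block:
            blocks.append(phrases)
            phrases = []
    # Keep any trailing syllables that were never explicitly closed.
    if current:
        phrases.append("".join(current).strip())
    if phrases:
        blocks.append(phrases)
    return blocks


syllables = [
    {"text": "Sun", "hyphen": "grammatical"},
    {"text": "dried "},
    {"text": "rai", "hyphen": "discretionary"},
    {"text": "sins? ", "endPhrase": "true"},  # extra marker added for illustration
    {"text": "Yes!", "endPhrase": "true", "endBlock": "true"},
]

print(split_lyrics(syllables))  # [['Sun-dried raisins?', 'Yes!']]
```

Each inner list could then be rendered as one verse or chorus, one phrase per line.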