Refactor lyrics tests, do not search for empty metadata #5452

snejus · 2024-10-03T13:41:08Z

Description

Fixes #2635
Fixes #5133

I realised that #5406 has gotten too big, thus I'm splitting it into several smaller PRs.

This PR refactors lyrics plugin tests and fixes an empty metadata issue in the lyrics logic.

CI

Added --extras=lyrics to the Poetry install command to include the lyrics plugin dependencies.
In the main task, which measures coverage, set the LYRICS_UPDATED environment variable based on changes detected in the lyrics files.

Test setup

Introduced ConfigMixin to centralize configuration setup for tests, reducing redundancy. This can be used by tests based on pytest.

Lyrics logic

Trimmed whitespace from item.title, item.artist, and item.artist_sort in search_pairs function.
Added checks to avoid searching for lyrics if either the artist or title is missing.
Improved _scrape_strip_cruft function to remove Google Ads tags and unnecessary HTML tags.

Lyrics tests overhaul

Migrated lyrics tests to use pytest for better isolation and configuration management.
Deleted redundant lyrics text files and some unused utils.
Marked tests that should only run when lyrics source code is updated (LYRICS_UPDATED is set from the CI) using the on_lyrics_update marker.
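
As a rough illustration of how such a marker could be wired up, here is a minimal sketch of a conftest.py hook. It is not the code from this PR: the helper name should_skip is hypothetical, and treating the string "true" as the enabling value of LYRICS_UPDATED is an assumption about what the CI exports.

```python
import os


def should_skip(marker_names, environ):
    """Return True if a test carrying these markers should be skipped."""
    if "on_lyrics_update" not in marker_names:
        return False
    # Assumption: the CI exports LYRICS_UPDATED as the string "true" when
    # the lyrics files changed; any other value leaves the tests skipped.
    return environ.get("LYRICS_UPDATED", "").lower() != "true"


def pytest_runtest_setup(item):
    """pytest hook: skip on_lyrics_update tests unless the lyrics code changed."""
    import pytest  # imported here so should_skip stays importable without pytest

    marker_names = {marker.name for marker in item.iter_markers()}
    if should_skip(marker_names, os.environ):
        pytest.skip("LYRICS_UPDATED is not 'true': lyrics source code is unchanged")
```

A test would then opt in with @pytest.mark.on_lyrics_update, and the marker would be registered in setup.cfg so that pytest does not warn about an unknown mark.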

Documentation and Dependencies

Added requests-mock version 1.12.1 to pyproject.toml and poetry.lock for mocking HTTP requests in tests.
Updated setup.cfg to include a new marker on_lyrics_update.

bal-e · 2024-10-05T15:41:55Z

.github/workflows/ci.yaml

+            beetsplug/lyrics.py
+            test/plugins/test_lyrics.py


I'm not entirely comfortable with this mode of testing-as-needed. These files necessarily reference many other parts of the codebase and it's always possible that they can break tests (even if they're pretty independent of the rest of the codebase right now, that can and probably will change in the future).

Currently these integration tests only run once a week, and no one finds out about failures 😟

CONTRIBUTING.rst

test/plugins/test_lyrics.py

snejus · 2024-10-18T21:25:39Z

@bal-e I added a pair of additional commits, as I noticed that the LRCLib integrated tests had been testing no-op logic and failed once I configured them correctly.

I also added lyrics texts that we're expecting to receive from each of the sources we're testing. These will be helpful to have when changes are made in the backends' scraping logic.

Two Google sources failed to return the expected output. I looked into why parsing failed in each case:
- lyrics on musica.com contain <aside> Google Ads
- each lyrics line on lacoccinelle.net is wrapped within alternating <em> and <strong> tags

Thus remove these tags as part of the HTML cleanup logic.
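
The cleanup described above could look roughly like the following sketch. It is not the actual _scrape_strip_cruft implementation: the function name strip_cruft and both regular expressions are illustrative, and it assumes that an <aside> block is dropped together with its contents while <em> and <strong> are only unwrapped so the lyrics text survives.

```python
import re

# Assumption: ad blocks are removed entirely, including their contents.
ASIDE_RE = re.compile(r"<aside\b.*?</aside>", re.IGNORECASE | re.DOTALL)
# Assumption: <em>/<strong> wrappers are removed but the wrapped text is kept.
INLINE_TAG_RE = re.compile(r"</?(?:em|strong)\b[^>]*>", re.IGNORECASE)


def strip_cruft(html: str) -> str:
    """Remove <aside> blocks and unwrap <em>/<strong> tags."""
    html = ASIDE_RE.sub("", html)
    return INLINE_TAG_RE.sub("", html)


page = '<em>Line one</em><br><aside class="ad">Buy now</aside><strong>Line two</strong>'
cleaned = strip_cruft(page)
```

For the sample page above, cleaned keeps both lyrics lines and the <br> separator while the ad text is gone.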

Create 'helpers.ConfigMixin' which sets up testing configuration. This is helpful for tests (e.g. test_lyrics.py) that only need the configuration and do not require a temp dir. (#5102) Refactor lyrics tests to fix the global beets config issue. Additionally, add the 'integration_test' mark that can be used to mark tests that should only run once a week.

- Replaced unittest.mock with pytest fixtures for better test isolation and readability.
- Simplified test cases by using parameterized tests.
- Added `requests-mock` dependency to `pyproject.toml` and `poetry.lock`.
- Removed redundant helper functions and classes.

The test for GeniusLyrics was heavily patched and no longer provided useful coverage. It has been removed to clean up the test suite.

- Consolidated multiple test cases into parameterized tests for better readability and maintainability.
- Simplified assertions by comparing lists of actual and expected artists/titles.
- Added `unexpected_empty_artist` marker to handle cases which unexpectedly return an empty artist. This seems to happen when the `artist_sort` field is empty.

Modified `search_pairs` function in `lyrics.py` to:
* Firstly strip each of `artist`, `artist_sort` and `title` fields
* Only generate alternatives if both `artist` and `title` are not empty
* Ensure that `artist_sort` is not empty and not equal to artist (ignoring case) before appending it to the artists

Extended tests to cover the changes.
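
The three rules above can be sketched as follows. This is a simplified illustration rather than the real `search_pairs`: the name `search_pairs_sketch` is hypothetical, and it returns plain (artist, title) pairs without generating the further artist and title alternatives that the plugin produces.

```python
from __future__ import annotations


def search_pairs_sketch(artist: str, artist_sort: str, title: str) -> list[tuple[str, str]]:
    """Return (artist, title) pairs worth searching for, or [] if metadata is missing."""
    # 1. Strip each field first.
    artist, artist_sort, title = (s.strip() for s in (artist, artist_sort, title))

    # 2. Do not search at all when either the artist or the title is empty.
    if not artist or not title:
        return []

    # 3. Only add artist_sort when it is non-empty and differs from artist,
    #    ignoring case.
    artists = [artist]
    if artist_sort and artist_sort.lower() != artist.lower():
        artists.append(artist_sort)

    return [(a, title) for a in artists]
```

With this shape, an item whose title is only whitespace yields no pairs, so no backend is queried for it.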

Since at least one backend (`LRCLib`) requires the `album` and `duration` arguments, the caller (`LyricsPlugin.fetch_item_lyrics`) must always provide them. Since they need to be provided, we enforce this by defining them as positional arguments.

Why is this important? I found that the integrated `LRCLib` tests have been passing, but they called `LRCLib.fetch` with values for the `artist` and `title` fields only, while the actual functionality *always* provides values for the `album` and `duration` fields too. When I adjusted the test to provide values for the missing fields, it failed. This makes sense: the `album` and `duration` filters are strict on LRCLib, so I was not surprised the lyrics could not be found. Thus I adjusted the `LRCLib` backend implementation to only filter by each of these fields when their values are truthy.
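
A minimal sketch of the "filter only when truthy" idea, under stated assumptions: the function name `build_lrclib_params` is hypothetical, and the query parameter names (`artist_name`, `track_name`, `album_name`, `duration`) are assumed from LRCLib's public API rather than taken from this PR.

```python
from __future__ import annotations


def build_lrclib_params(artist: str, title: str, album: str, duration: float) -> dict[str, str | float]:
    """Build query parameters, adding the strict filters only when they have values."""
    # All four arguments are positional, so callers cannot forget album/duration.
    params: dict[str, str | float] = {"artist_name": artist, "track_name": title}
    if album:
        params["album_name"] = album
    if duration:
        params["duration"] = duration
    return params
```

An item with an empty album and a zero duration then produces a query filtered by artist and title only, instead of one that the strict filters would reject.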

Add explicit checks for lyrics texts fetched from the tested sources.
- Introduced `LyricsPage` class to represent lyrics pages for integrated tests.
- Configured expected lyrics for each of the URLs that are being fetched.
- Consolidated integrated tests in a new `TestLyricsSources` class.
- Mocked Google Search API to return the lyrics page under test.

snejus self-assigned this Oct 3, 2024

snejus requested a review from bal-e October 3, 2024 13:41

snejus force-pushed the lyrics-refactor-tests branch 4 times, most recently from 626bf91 to 273e39b Compare October 3, 2024 19:11

bal-e reviewed Oct 5, 2024

snejus force-pushed the lyrics-refactor-tests branch 3 times, most recently from 200cd19 to 9e67a56 Compare October 12, 2024 01:15

snejus changed the base branch from master to lyrics-fix-tekstowo October 12, 2024 01:39

snejus force-pushed the lyrics-refactor-tests branch 2 times, most recently from b4d78da to 038ca48 Compare October 12, 2024 01:48

bal-e reviewed Oct 12, 2024

Base automatically changed from lyrics-fix-tekstowo to master October 12, 2024 21:52

snejus force-pushed the lyrics-refactor-tests branch 2 times, most recently from 2fb6682 to d02e017 Compare October 12, 2024 22:10

snejus requested a review from bal-e October 12, 2024 22:18

snejus force-pushed the lyrics-refactor-tests branch 2 times, most recently from 54f06f9 to 19dff1c Compare October 18, 2024 21:14

snejus requested a review from jackwilsdon October 18, 2024 21:25

snejus force-pushed the lyrics-refactor-tests branch from 19dff1c to e70f61b Compare October 19, 2024 00:48

snejus added 7 commits October 30, 2024 19:22

Rewrite lyrics integration tests

fbca24c

Remove redundant lyrics test files

a0dcc77

Configure integrated lyrics tests to only run on lyrics code changes

d40b6f5

Remove outdated GeniusLyrics test

75e26e8

The test for GeniusLyrics was heavily patched and no longer provided useful coverage. It has been removed to clean up the test suite.

snejus added 8 commits October 30, 2024 19:23

Refactor test_slug to pytest

bfc23eb

Refactor utils test cases to use pytest.mark.parametrize

d7cc5c4

Remove pytest.param alias _p

51ced7e

Google: test the entire fetch method

350f233

snejus force-pushed the lyrics-refactor-tests branch from e70f61b to 278279e Compare October 30, 2024 19:23
