Add Word-Level Alignment #215

Open
wants to merge 28 commits into
base: main
Changes from 23 commits
11b3ec5
Clean up tuple unpacking
ibevers Dec 5, 2024
f0e0f6f
Update to calculate character timestamps
ibevers Dec 11, 2024
bbac27f
Start implementation with ScriptLine
ibevers Dec 11, 2024
4a1c447
Merge branch 'main' into 214-task-word-level-alignment
ibevers Dec 11, 2024
cb6d965
Update timestamping to use dicts
ibevers Dec 11, 2024
e8c7bb3
Update _assign_timestamps_to_characters to fully align
ibevers Dec 13, 2024
525110a
Clean up vestigial code and comments
ibevers Dec 13, 2024
ed88772
Remove unused args
ibevers Dec 13, 2024
47442df
Update ScriptLine conversion to use from_dict
ibevers Dec 13, 2024
f177a29
Update comments and fix mypy errors
ibevers Dec 17, 2024
f9c9b13
Clean up long comment
ibevers Dec 17, 2024
eef0734
Fix 2 broken tests
ibevers Dec 17, 2024
ded8325
Remove vestigial test
ibevers Dec 17, 2024
bbd4107
Add fixture with real alignment for comparison
ibevers Dec 18, 2024
fdaa824
Add had that curiosity audio fixture
ibevers Dec 18, 2024
7f8ebd3
Add test with had_that_curiosity audio
ibevers Dec 18, 2024
816fb76
Adjust curiosity test
ibevers Dec 19, 2024
060cf4e
Replace segment level whisper timestamps with more accurate alignment…
ibevers Dec 19, 2024
5934adf
Add forced alignment alignment difference check function
ibevers Dec 19, 2024
94e351e
Improve evaluation function name
ibevers Dec 19, 2024
eda1af0
Update alignment comparison to use lower case
ibevers Dec 19, 2024
88cbd7b
Add compare_alignments to curiosity test
ibevers Dec 19, 2024
754b1fe
Rename assign timestamps function
ibevers Dec 19, 2024
0429de0
Merge branch 'main' into 214-task-word-level-alignment
ibevers Jan 2, 2025
8d04540
Change default device type to CPU
ibevers Jan 2, 2025
8a7cdfa
Update test_align_transcriptions_fixture to transcribe the input audio
ibevers Jan 2, 2025
64a0011
Add aligned mono audio fixture
ibevers Jan 2, 2025
650e5f5
Remove redundant test
ibevers Jan 2, 2025
7 changes: 0 additions & 7 deletions src/senselab/audio/tasks/forced_alignment/data_structures.py
@@ -51,13 +51,6 @@ class TranscriptionResult(TypedDict):
    language: str


class AlignedTranscriptionResult(TypedDict):
    """A list of segments and word segments of a speech."""

    segments: List[SingleAlignedSegment]
    word_segments: List[SingleWordSegment]


@dataclass
class Point:
    """Represents a point in the alignment path.
26 changes: 26 additions & 0 deletions src/senselab/audio/tasks/forced_alignment/evaluation.py
Collaborator

I can see some useful scenarios for using compare_alignments. Thank you for implementing it!

Collaborator Author

No problem, glad to hear!

@@ -0,0 +1,26 @@
"""Provides alignment evaluation functions."""

from senselab.utils.data_structures.script_line import ScriptLine


def compare_alignments(alignment_one: ScriptLine, alignment_two: ScriptLine, difference_tolerance: float = 0.1) -> None:
    """Check if two alignments are within the specified difference tolerance.

    Args:
        alignment_one (ScriptLine): The first alignment segment.
        alignment_two (ScriptLine): The second alignment segment.
        difference_tolerance (float): Allowed difference in start and end times (seconds).

    Raises:
        AssertionError: If the start or end times differ by more than the tolerance.
    """
    print(f"Texts: {alignment_one.text} | {alignment_two.text}")
    if alignment_one.text and alignment_two.text:
        assert alignment_one.text.lower() == alignment_two.text.lower(), f"{alignment_one.text} {alignment_two.text}"
    if alignment_one.start is not None and alignment_two.start is not None:
        assert abs(alignment_one.start - alignment_two.start) < difference_tolerance
    if alignment_one.end is not None and alignment_two.end is not None:
        assert abs(alignment_one.end - alignment_two.end) < difference_tolerance
    if alignment_one.chunks and alignment_two.chunks and len(alignment_one.chunks) == len(alignment_two.chunks):
        for a1, a2 in zip(alignment_one.chunks, alignment_two.chunks):
            compare_alignments(a1, a2)
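
Note on the recursive call: as written, `compare_alignments(a1, a2)` does not pass `difference_tolerance`, so nested chunks appear to be checked against the default of 0.1 seconds regardless of what the caller requested. The sketch below shows one possible way to forward the tolerance. It is not the code in this PR: `Line` is a hypothetical stand-in for `ScriptLine` that carries only the four fields the function reads, and `compare_with_tolerance` is an illustrative name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    """Hypothetical stand-in for senselab's ScriptLine (only the fields used here)."""

    text: Optional[str] = None
    start: Optional[float] = None
    end: Optional[float] = None
    chunks: Optional[List["Line"]] = None


def compare_with_tolerance(one: Line, two: Line, tolerance: float = 0.1) -> None:
    """Assert that two alignments agree, forwarding the tolerance to nested chunks."""
    if one.text and two.text:
        assert one.text.lower() == two.text.lower(), f"{one.text!r} != {two.text!r}"
    if one.start is not None and two.start is not None:
        assert abs(one.start - two.start) < tolerance, "start times differ"
    if one.end is not None and two.end is not None:
        assert abs(one.end - two.end) < tolerance, "end times differ"
    if one.chunks and two.chunks and len(one.chunks) == len(two.chunks):
        for c1, c2 in zip(one.chunks, two.chunks):
            # Forward the caller's tolerance instead of falling back to the default.
            compare_with_tolerance(c1, c2, tolerance)


# Usage: segment-level times are unset, word-level times differ by 0.2 s.
seg_a = Line(text="Had", chunks=[Line("Had", 0.0, 0.3)])
seg_b = Line(text="had", chunks=[Line("had", 0.2, 0.5)])
compare_with_tolerance(seg_a, seg_b, tolerance=0.25)  # passes only because the tolerance is forwarded
```

With the tolerance forwarded, a caller who loosens or tightens the threshold gets the same threshold at every nesting level. Whether that is the intended behavior for this PR is for the author to confirm.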