Enhanced Timestamp Formatting and Custom Interval Grouping #50

Merged: 6 commits, Feb 13, 2024

bscholer (Contributor) commented Feb 12, 2024

In this PR, I'm rolling out two significant updates to our timestamp handling:

  1. Automatic Timestamp Formats: I've added an auto timestamp format option, which is now the default. With it selected, there's no need to choose manually between HH:mm:ss, mm:ss, or ss. The format adapts to the transcript: if the maximum segment timestamp exceeds 60 minutes, timestamps are shown as HH:mm:ss; if it is below 60 minutes, they are shown as mm:ss. I chose not to switch to ss automatically because I find it unusual and somewhat confusing to read. The ss format is still available as a manual choice for those who want it.

  2. Custom Timestamp Intervals for Issue 45: To address issue #45 ("Time-Stamp intervals are too short"), I've added the ability to group segments by a custom timestamp interval. The feature is turned off by default. Each group's start and end times are taken from the speech it actually contains, not from the interval boundaries. For instance, with a 15-second interval where the speech only runs from 00:04 to 00:13, the group's timestamp is shown as 00:04 - 00:13. Note, however, that the next group still covers the 00:15 to 00:30 interval, regardless of where the last segment of the preceding group ended. The feature works with segments from both Whisper ASR and Swiftink, but it works best with word-level timestamps from Whisper ASR: smaller segments are much less likely to cross into another interval, so the grouping stays closer to the spoken content.
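The interval grouping in item 2 can be sketched roughly as follows. This is only an illustration, not the code in this PR: the names Segment, Group, and groupSegmentsByInterval are hypothetical, times are assumed to be in seconds, segments are assumed to be sorted by start time, and assigning a segment to an interval by its start time is my assumption about how boundary-crossing segments are handled.

```typescript
// Hypothetical shapes for illustration; not the plugin's actual types.
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

interface Group {
  start: number; // start of the first segment in the group
  end: number;   // latest end among the segments in the group
  text: string;
}

// Groups segments into fixed-width intervals [0, n), [n, 2n), ...
// Assumption: a segment belongs to the interval containing its start time,
// and the input is already sorted by start time.
function groupSegmentsByInterval(segments: Segment[], intervalSeconds: number): Group[] {
  if (intervalSeconds <= 0) {
    throw new Error("intervalSeconds must be positive");
  }
  const groups: Group[] = [];
  let currentBucket = -1;
  for (const seg of segments) {
    const bucket = Math.floor(seg.start / intervalSeconds);
    const last = groups[groups.length - 1];
    if (last !== undefined && bucket === currentBucket) {
      // Same interval: extend the group using the real segment end time.
      last.end = Math.max(last.end, seg.end);
      last.text += " " + seg.text.trim();
    } else {
      // New interval: the group starts where its first segment starts.
      groups.push({ start: seg.start, end: seg.end, text: seg.text.trim() });
      currentBucket = bucket;
    }
  }
  return groups;
}

// Example matching the description: 15-second interval, speech from 00:04 to 00:13.
const exampleGroups = groupSegmentsByInterval(
  [
    { start: 4, end: 9, text: "Hello" },
    { start: 9, end: 13, text: "world" },
    { start: 16, end: 22, text: "Next" },
  ],
  15,
);
```

Here the first group spans 4 to 13 seconds rather than 0 to 15, and the segment starting at 16 seconds opens a new group for the 15 to 30 second interval. With this start-time rule, a segment that begins at 00:13 and ends at 00:17 would stay in the first group and stretch its end time, which is why shorter, word-level segments group more cleanly.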

In summary, these updates add a new 'Timestamp Interval' setting and rewrite the segmentsToTimestampedString() function. Together they make timestamp formatting and segment grouping more flexible.
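For the automatic format selection, a minimal sketch might look like the following. The names TimestampFormat, resolveFormat, and formatSeconds are hypothetical and are not taken from the plugin. The description does not say what happens at exactly 60 minutes; this sketch assumes HH:mm:ss is used from 3600 seconds onward.

```typescript
// Hypothetical names for illustration; not the plugin's actual API.
type FixedFormat = "HH:mm:ss" | "mm:ss" | "ss";
type TimestampFormat = "auto" | FixedFormat;

// Resolves "auto" to a concrete format based on the largest timestamp (in seconds).
// Assumption: exactly 3600 seconds already uses HH:mm:ss.
// "ss" is never chosen automatically; it stays a manual option.
function resolveFormat(format: TimestampFormat, maxSeconds: number): FixedFormat {
  if (format !== "auto") {
    return format;
  }
  return maxSeconds >= 3600 ? "HH:mm:ss" : "mm:ss";
}

// Zero-pads a non-negative integer to at least two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Formats a time in seconds using an already-resolved format.
function formatSeconds(totalSeconds: number, format: FixedFormat): string {
  const s = Math.floor(totalSeconds);
  if (format === "HH:mm:ss") {
    return pad2(Math.floor(s / 3600)) + ":" + pad2(Math.floor((s % 3600) / 60)) + ":" + pad2(s % 60);
  }
  if (format === "mm:ss") {
    return pad2(Math.floor(s / 60)) + ":" + pad2(s % 60);
  }
  return String(s); // "ss": plain seconds
}

// A short transcript resolves to mm:ss, a long one to HH:mm:ss.
const shortStamp = formatSeconds(75, resolveFormat("auto", 75));
const longStamp = formatSeconds(3725, resolveFormat("auto", 3725));
```

Resolving the format once from the maximum timestamp, instead of per segment, keeps every timestamp in a transcript in the same format.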

bscholer marked this pull request as draft February 12, 2024 18:28

bscholer (Contributor, Author) commented

Not quite ready to merge yet, will be wrapped up shortly.

bscholer marked this pull request as ready for review February 12, 2024 18:51

djmango (Owner) commented Feb 12, 2024

Looks good overall, but I think there are some conflicts with #49

bscholer (Contributor, Author) commented

I'll get them sorted this evening!

bscholer (Contributor, Author) commented

@djmango conflicts are fixed

djmango merged commit e29dc89 into djmango:master Feb 13, 2024
1 check passed

djmango (Owner) commented Feb 13, 2024

Good stuff, will release update
