Sync Audiobook and eBook progress #2091

KarmicIT · 2023-09-13T09:18:35Z

Something I haven't seen in other audio/ebook applications is the ability to sync progress across formats. This is great for those who listen to an audiobook in the car but would like to switch to the ebook at home.

I imagine this would be fairly difficult to implement natively, so my idea would be to include the ability to enable connectivity via API with one or more 3rd parties offering speech-to-text (STT) services. These could be local apps/docker containers (like voice2json) or cloud-hosted services (like Azure Speech Services).

Pre Sync Workflow

ABS imports a new audiobook or runs a scheduled task over existing audiobooks.
ABS calls the required STT service API
The audiobook is submitted based on user-selectable options (full book, per-chapter, per-file)
STT service returns timestamped text back to ABS
ABS saves text as metadata in the audiobook folder
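
A rough sketch of the pre-sync steps above, in Python. Everything named here is an assumption made for illustration: the `Segment` shape, the `transcribe` callable standing in for whichever STT service is configured, and the JSON file layout. None of it is an existing ABS API.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable, Dict, List


@dataclass
class Segment:
    start: float  # seconds from the start of the audio file
    end: float
    text: str


# Hypothetical adapter: wraps a local or cloud STT service and returns
# timestamped text for one audio file.
Transcriber = Callable[[Path], List[Segment]]


def save_stt_metadata(
    audio_files: List[Path], transcribe: Transcriber, out_path: Path
) -> List[Dict]:
    """Transcribe each file and save the timestamped text next to the book."""
    records = []
    for audio in audio_files:
        for seg in transcribe(audio):
            # Keep the file name so multi-file books can be resolved later.
            records.append({"file": audio.name, **asdict(seg)})
    out_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return records
```

Storing per-file timestamps rather than book-wide offsets keeps the sketch independent of how file durations are obtained.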

Sync Workflow

ABS determines it's time to sync progress
ABS verifies the STT metadata is available for the book. If not, skip the book.
ABS determines which format has the "latest" time-sync to indicate the direction of sync
If the sync direction is audio->ebook
--- ABS looks up the current audio timestamp within the STT metadata to find the last sentence of text
--- If the text is shorter than a few words, a new search is performed from 5-10 seconds earlier
--- ABS then searches for that text directly in the ebook
--- ebook progress is updated to that location
If the sync direction is ebook->audio
--- ABS searches the STT metadata for the last sentence of text (must be greater than a few words) and gets the timestamp
--- ABS then sets audiobook progress to that timestamp
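
The sync workflow above could be sketched roughly like this. It is a minimal illustration under stated assumptions: `MIN_WORDS` and `BACKOFF_SECONDS` are guesses at "a few words" and "5-10 seconds", the ebook is available as plain text, and matching is an exact substring search on normalized text. Real STT output would likely need fuzzy matching, and a sentence can span more than one STT segment.

```python
import bisect
import re
from typing import List, Optional, Tuple

Segment = Tuple[float, str]  # (start time in seconds, transcribed text)

MIN_WORDS = 4          # assumption: what "a few words" means
BACKOFF_SECONDS = 7.5  # assumption: middle of the 5-10 second range


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", "", text.lower()).split())


def sync_direction(audio_updated_at: float, ebook_updated_at: float) -> str:
    """The format with the most recent progress update is the source."""
    return "audio->ebook" if audio_updated_at >= ebook_updated_at else "ebook->audio"


def segment_at(segments: List[Segment], position: float) -> Optional[Segment]:
    """Last segment that starts at or before the given audio position."""
    starts = [start for start, _ in segments]
    i = bisect.bisect_right(starts, position) - 1
    return segments[i] if i >= 0 else None


def audio_to_ebook(
    segments: List[Segment], position: float, ebook_text: str
) -> Optional[float]:
    """Return the ebook position as a fraction (0-1) of the normalized text."""
    t = position
    seg = segment_at(segments, t)
    # Step back while the text is too short to search for reliably.
    while seg is not None and len(seg[1].split()) < MIN_WORDS and t > 0:
        t = max(0.0, t - BACKOFF_SECONDS)
        seg = segment_at(segments, t)
    if seg is None:
        return None
    haystack = normalize(ebook_text)
    idx = haystack.find(normalize(seg[1]))
    return None if idx < 0 else idx / max(len(haystack), 1)


def ebook_to_audio(segments: List[Segment], sentence: str) -> Optional[float]:
    """Return the start time of the first segment containing the sentence."""
    needle = normalize(sentence)
    if len(needle.split()) < MIN_WORDS:
        return None
    for start, text in segments:
        if needle in normalize(text):
            return start
    return None
```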

nichwall · 2023-09-13T14:01:19Z

nichwall
Sep 13, 2023

Hey there! This has been brought up a few times, but I think it would be a great addition (once more bugs and things are worked out).

#189

I started making an external utility at the end of last year if you want to check it out. I haven't worked on it since then, but have thought about it quite a bit (and discussed in the ABS Discord). Others have ideas for online utilities though.

https://github.com/nichwall/aesync

My current thought is pretty similar to yours, where there is minimal preprocessing because transcribing a book is compute intensive, but is still computed ahead of time. Basically, 3-5 second chunks are transcribed every 10-20 minutes in the book, and then that short snippet is matched to the ebook. Then, check for any sections that are longer than a threshold (due to not matching) and get snippets around where it should be until there is a match. This is mostly to account for books that have extra information that is read or ignored in the audiobooks (such as pictures or tables) or just bad matching due to made up words.
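
To illustrate the sparse matching idea, here is a rough sketch using Python's standard `difflib`. It is not the code from the aesync repository: the function names, the word-level sliding window, and the 0.7 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def best_match(
    snippet: str, ebook_words: List[str], lo: int, hi: int, threshold: float = 0.7
) -> Optional[Tuple[int, float]]:
    """Slide a window over ebook_words[lo:hi] and return (word_index, score)
    for the closest match to the snippet, or None if below the threshold."""
    target = snippet.lower().split()
    n = len(target)
    if n == 0:
        return None
    best: Optional[Tuple[int, float]] = None
    for i in range(max(lo, 0), min(hi, len(ebook_words)) - n + 1):
        window = [w.lower() for w in ebook_words[i:i + n]]
        score = SequenceMatcher(None, target, window).ratio()
        if best is None or score > best[1]:
            best = (i, score)
    if best is None or best[1] < threshold:
        return None
    return best


def build_anchors(
    samples: List[Tuple[float, str]], ebook_words: List[str], threshold: float = 0.7
) -> List[Tuple[float, int]]:
    """samples are (audio_seconds, transcribed snippet) pairs in playback order.
    Returns (audio_seconds, ebook_word_index) for every snippet that matched."""
    anchors = []
    start = 0
    for seconds, snippet in samples:
        hit = best_match(snippet, ebook_words, start, len(ebook_words), threshold)
        if hit is not None:
            anchors.append((seconds, hit[0]))
            start = hit[0]  # later snippets should match further into the book
    return anchors


def gaps_to_resample(
    anchors: List[Tuple[float, int]], max_gap_seconds: float
) -> List[Tuple[float, float]]:
    """Audio ranges between neighbouring anchors that are longer than the
    threshold, i.e. where extra snippets should be transcribed."""
    return [
        (a[0], b[0])
        for a, b in zip(anchors, anchors[1:])
        if b[0] - a[0] > max_gap_seconds
    ]
```

The window scan is quadratic in the worst case, so a real implementation would probably restrict the search to a region around the expected position.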

These matched timestamps are converted to a percentage for the library item, so it would work for both single file books and multiple file books. These percentages will be stored in a database for the utility (unless advplyr wants to integrate it directly into ABS) with the audiobook and ebook path. This way, whenever a sync is requested, if the mapping already exists, it is just pulled, and if it doesn't then it is able to be calculated pretty quickly and stored for future use.
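
A percentage map like this can be applied with simple linear interpolation between the matched points. The sketch below is only an illustration: the function names are made up, and pinning the endpoints at (0, 0) and (100, 100) is an assumption that ignores front and back matter.

```python
import bisect
from typing import List, Tuple

Point = Tuple[float, float]


def to_percent_map(
    anchors: List[Tuple[float, int]], audio_duration: float, ebook_length: int
) -> List[Point]:
    """Convert (audio_seconds, ebook_char_offset) anchors into sorted
    (audio_percent, ebook_percent) points."""
    points = {(0.0, 0.0), (100.0, 100.0)}  # assumed endpoints
    for seconds, offset in anchors:
        points.add((100.0 * seconds / audio_duration, 100.0 * offset / ebook_length))
    return sorted(points)


def interpolate(points: List[Point], x: float) -> float:
    """Piecewise-linear lookup of y for x in a list of points sorted by x."""
    xs = [p[0] for p in points]
    i = bisect.bisect_right(xs, x)
    if i == 0:
        return points[0][1]
    if i == len(points):
        return points[-1][1]
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    if x1 == x0:
        return y0
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def audio_to_ebook_pct(pmap: List[Point], audio_pct: float) -> float:
    return interpolate(pmap, audio_pct)


def ebook_to_audio_pct(pmap: List[Point], ebook_pct: float) -> float:
    # Swap the axes so the same lookup works in the other direction.
    return interpolate(sorted((e, a) for a, e in pmap), ebook_pct)
```

Because the map is stored as percentages, the same lookup works for single-file and multi-file books.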

The biggest outstanding issues for me

converting ebook text percentage to rendered percentage (due to books, tables, and page size)
building the fancy lookup database
testing on books

The main reason I stopped was ABS didn't store ebook progress yet, but it does as of July. I think the main thing left on the ABS side is storing the ebook and audiobook progress separately (advplyr/audiobookshelf-app#870)

I think ABS also shouldn't automatically sync the progress, but instead have 3 buttons: Sync to ebook, Sync to Audiobook, and Begin Matching. This could be more automated later, but just to get an experimental feature out there it would be pretty simple on the UI side. The reason I don't think it should automatically sync is if someone is reading along in the ebook while listening. This is a simple enough check to add, but could get complicated with opening/closing things at different times.

EDIT: not that this is the best way, just my thoughts on a simple way to do it. Generating sync positions for Rhythm of War takes 2-5 minutes on my old laptop and is accurate to about 200 characters when picking random places to sync after making the sync positions.

Also the "only thing left on ABS side" is all my solution is missing to begin testing/implementation, but there are more pressing issues with the apps and server itself

4 replies

nichwall Sep 13, 2023

A limitation on my method would be that the sync could only occur when online, unless the "sync map" is added to the library item like you mentioned so it can also be downloaded to the mobile apps.

advplyr Sep 13, 2023
Maintainer

I was thinking that the sync map file would be generated ahead of time instead of on-the-fly. Plex recently rolled out a feature that detects tv series intro and credits so they can be skipped. They run that detection as a separate step in the scanner.

nichwall Sep 13, 2023

That's fair. My main reasoning was for users with large libraries, so the sync map is generated during the first time the item is played/read (and should be done before they finish that session, even if they only read a few pages) so it's still done ahead of time, but is nicer for people using less powerful servers and doesn't need to be done for all items during a scan, sort of like persisting a transcode.

Either way, I think having a "percentage map" function would make syncing simpler since no transcoding needs to be done when figuring out where in the book the person is for syncing.
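
Generating the map the first time an item is played and keeping it afterwards could look roughly like this. The `SyncMapStore` class, the JSON file, and the key format are hypothetical; a real version would presumably use the server's database.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

PercentMap = List[List[float]]  # [[audio_percent, ebook_percent], ...]


class SyncMapStore:
    """Tiny persistent cache keyed by audiobook path and ebook path."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)
        self._maps: Dict[str, PercentMap] = (
            json.loads(self.path.read_text(encoding="utf-8"))
            if self.path.exists()
            else {}
        )

    def get_or_build(
        self,
        audio_path: str,
        ebook_path: str,
        build: Callable[[str, str], PercentMap],
    ) -> PercentMap:
        key = f"{audio_path}|{ebook_path}"
        if key not in self._maps:
            # Expensive step (transcribing and matching) runs only once.
            self._maps[key] = build(audio_path, ebook_path)
            self.path.write_text(json.dumps(self._maps), encoding="utf-8")
        return self._maps[key]
```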

I'll probably pick my side project back up sometime this month because I'm still curious how good I can get it.

nichwall Sep 13, 2023

Also to be clear, I'm not saying my solution needs to be the chosen option. My side project started out as a playground to see if my idea was even "good enough", and to maybe eventually provide a utility ABS can just interface with like tone until ABS is ready to fully integrate it.

mr-ransel · 2023-11-13T16:05:32Z

mr-ransel
Nov 13, 2023

I was thinking about this and I went ahead and opened up #2308 specifically as the "lowest possible barrier to entry" version of this that requires no STT to function (using percentage of the way through chapters).
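
As a rough illustration of syncing by percentage through a chapter, the sketch below assumes both formats list the same chapters in the same order and that chapter start positions are known (seconds for audio, character offsets for the ebook). The function names are made up for the example and are not taken from #2308.

```python
import bisect
from typing import List, Tuple


def chapter_fraction(
    position: float, chapter_starts: List[float], total: float
) -> Tuple[int, float]:
    """Return (chapter_index, fraction of the way through that chapter)."""
    i = max(bisect.bisect_right(chapter_starts, position) - 1, 0)
    start = chapter_starts[i]
    end = chapter_starts[i + 1] if i + 1 < len(chapter_starts) else total
    frac = (position - start) / (end - start) if end > start else 0.0
    return i, min(max(frac, 0.0), 1.0)


def position_from_fraction(
    index: int, frac: float, chapter_starts: List[float], total: float
) -> float:
    """Inverse of chapter_fraction for the other format's chapter list."""
    start = chapter_starts[index]
    end = chapter_starts[index + 1] if index + 1 < len(chapter_starts) else total
    return start + frac * (end - start)


def audio_to_ebook_position(
    seconds: float,
    audio_chapters: List[float],
    audio_duration: float,
    ebook_chapters: List[float],
    ebook_length: float,
) -> float:
    i, frac = chapter_fraction(seconds, audio_chapters, audio_duration)
    return position_from_fraction(i, frac, ebook_chapters, ebook_length)
```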

If you add even a small amount of STT at a few spots in the audiobook, say 5-6 data points, a linear regression of those points against the other format's page numbers will give estimates very close to the real values because:

It can factor in skipping all the padding (copyrights, appendix, table of contents) at the beginning and the end if you have data points for those bookends
Most authors are pretty consistent with their use of tables and "interesting" formatting throughout books.
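
A minimal sketch of that regression, assuming each data point is an (audio seconds, page number) pair found by matching a short transcribed snippet to the ebook. It is an ordinary least-squares fit written out by hand; the function names are illustrative only.

```python
from typing import List, Tuple


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares fit of page = slope * seconds + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("data points must have different timestamps")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_page(seconds: float, slope: float, intercept: float) -> float:
    return slope * seconds + intercept


def predict_seconds(page: float, slope: float, intercept: float) -> float:
    return (page - intercept) / slope
```

A non-zero intercept is what absorbs the front matter: audio time 0 maps to the first page of actual content rather than page 1.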

0 replies

mdb17 · 2024-10-09T15:33:08Z

mdb17
Oct 9, 2024

I was looking into whether it was possible to sync my reading and listening progress and stumbled across this discussion.

One thought I had that might be a "first step" towards automating it in the future: what if the ability to associate page numbers with the chapter list was added, so that a user could manually tie the two together? I know this would only keep the progress synced to the chapter level, but it would be closer than is possible now.
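
A manual chapter-to-page association could be as small as the following sketch. The `ChapterLink` record and both lookup functions are hypothetical; they only show that chapter-level sync needs nothing beyond a sorted list the user fills in.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class ChapterLink:
    title: str
    audio_start: float  # seconds, from the audiobook chapter list
    first_page: int     # entered manually by the user


def page_for_audio(links: List[ChapterLink], seconds: float) -> int:
    """First page of the chapter that contains the audio position."""
    starts = [link.audio_start for link in links]
    i = max(bisect.bisect_right(starts, seconds) - 1, 0)
    return links[i].first_page


def audio_for_page(links: List[ChapterLink], page: int) -> float:
    """Audio start time of the chapter that contains the page."""
    pages = [link.first_page for link in links]
    i = max(bisect.bisect_right(pages, page) - 1, 0)
    return links[i].audio_start
```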

0 replies

AndryXY · 2024-10-14T22:12:04Z

AndryXY
Oct 14, 2024

Perhaps it makes sense to approach the goal in several steps. In my opinion, a first step in the right direction would be integration with a Calibre database. It is quite easy to read the relevant data from this.
Relevant data would be e.g.

Matching by title (table: books) and displaying an icon to indicate that there is an eBook for the audiobook
If necessary, transfer of metadata from the Calibre database
calibreSchema-example.htm.pdf
(here is a small excerpt from a small analysis script that I recently wrote for a friend's Calibre database)
Last read position from (table: last_read_positions)
etc.
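
Reading this data could look roughly like the sketch below, using Python's standard `sqlite3` module. The `books` and `last_read_positions` table names come from the list above; the column names (`id`, `title`, `book`, `format`, `cfi`, `pos_frac`, `epoch`) are assumptions about the Calibre schema and should be checked against an actual `metadata.db` before relying on them.

```python
import sqlite3
from typing import Optional, Tuple


def find_ebook(conn: sqlite3.Connection, title: str) -> Optional[Tuple[int, str]]:
    """Case-insensitive title match against the books table."""
    return conn.execute(
        "SELECT id, title FROM books WHERE title = ? COLLATE NOCASE LIMIT 1",
        (title,),
    ).fetchone()


def last_read_position(
    conn: sqlite3.Connection, book_id: int
) -> Optional[Tuple[str, str, float, float]]:
    """Most recent (format, cfi, pos_frac, epoch) row for a book, if any."""
    return conn.execute(
        "SELECT format, cfi, pos_frac, epoch FROM last_read_positions "
        "WHERE book = ? ORDER BY epoch DESC LIMIT 1",
        (book_id,),
    ).fetchone()
```

Opening the database read-only, for example with `sqlite3.connect("file:metadata.db?mode=ro", uri=True)`, avoids interfering with Calibre itself.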

As long as there is no such integration, it makes no sense in my eyes to think about text to audio transcription. And re-implementing something like calibreWeb (based on a Calibre SQLite database) in audiobookshelf from scratch makes no sense either.

0 replies
