Enable VOSK Transcription #1343

Closed
wants to merge 14 commits into from

@janonym1 janonym1 commented Jul 5, 2022

This creates 2 (maybe) new ENV variables to allow for easier VOSK integration. By default, it is assumed that GCLOUD is always used, which makes it hard(er) to set up VOSK within docker-jitsi-meet.

I still need to add the GCLOUD check and set the ENV GOOGLE_APPLICATION_CREDENTIALS to /config/key.json within the run script (jigasi/rootfs/etc/services.d/jigasi/run), but I am not sure how best to approach this. Maybe let's define an additional variable (ENABLE_GCLOUD_TRANSCRIPTION) and use that?

@janonym1
Author

janonym1 commented Jul 5, 2022

I added a new var ENABLE_GCLOUD_TRANSCRIPTION, but I am not sure whether my code in jigasi/rootfs/etc/services.d/jigasi/run, which sets the GCLOUD ENV for the key.json, is working as intended.

@janonym1
Author

janonym1 commented Jul 5, 2022

I am also not sure whether it would be better/nicer to chain the check for the GCLOUD creds in 10-config:

I wanted to AND together the following checks:
if [[ ($ENABLE_TRANSCRIPTIONS -eq 1 || $ENABLE_TRANSCRIPTIONS == "true") && ($ENABLE_VOSK -eq 0 || $ENABLE_VOSK == "false") && ($ENABLE_GCLOUD_TRANSCRIPTION -eq 1 || $ENABLE_GCLOUD_TRANSCRIPTION == "true") ]] but that isn't working as I imagined. I assume I am misunderstanding some syntax here?
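
One possible way to make a chained check like this easier to reason about is to move the "1 or true" comparison into a small helper and combine the helper calls with `&&` and `!`. The sketch below is only an illustration, not the code from this PR: the function names `is_enabled` and `needs_gcloud_credentials` are hypothetical, and it assumes that any value other than `1` or `true` (including an unset variable) means "disabled".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): succeed when a flag is "1" or "true".
# Anything else, including an empty or unset value, counts as disabled.
is_enabled() {
    case "${1:-}" in
        1|true) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical helper: succeed only when transcriptions are on, VOSK is off,
# and Google Cloud transcription is on (the three checks ANDed together).
needs_gcloud_credentials() {
    is_enabled "${1:-}" && ! is_enabled "${2:-}" && is_enabled "${3:-}"
}

if needs_gcloud_credentials "${ENABLE_TRANSCRIPTIONS:-}" "${ENABLE_VOSK:-}" "${ENABLE_GCLOUD_TRANSCRIPTION:-}"; then
    echo "Google Cloud credentials are required"
else
    echo "Google Cloud credentials are not required"
fi
```

Comparing strings in a `case` statement sidesteps the mix of numeric (`-eq`) and string (`==`) comparisons on the same variable, so a value such as `true` is never evaluated as a number.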

jigasi/rootfs/etc/cont-init.d/10-config (outdated, resolved)
jigasi/rootfs/defaults/sip-communicator.properties (outdated, resolved)
jigasi.yml (outdated, resolved)
jigasi/rootfs/etc/services.d/jigasi/run (outdated, resolved)
Author

@janonym1 janonym1 left a comment

updated the requested changes

@janonym1 janonym1 requested a review from saghul July 6, 2022 08:39
@debendraoli
Contributor

Is there anything left to do? I would be happy to help.

It would be very useful when this gets merged. Thanks

@janonym1
Author

Is there anything left to do? I would be happy to help.

It would be very useful when this gets merged. Thanks

I updated the requested changes (e.g. the variable naming ENABLE_VOSK_TRANSCRIPTION), but I need to test it on my test system first, which I unfortunately haven't gotten around to yet. If I botched something, I would also break the GCLOUD transcription ability, which more users seem to use than VOSK :)

I am also not sure about the logic that checks whether transcriptions are on (but not gcloud), which I will hopefully get around to next week.

@charles-zablit
Contributor

charles-zablit commented Oct 5, 2022

Hi, is there any update on this, and is there anything I can do to help?


@HarHarLinks HarHarLinks left a comment

it seems to me one variable was missed when renaming

jigasi/rootfs/etc/cont-init.d/10-config (outdated, resolved)
@rouaidakacem

@janonym1 is there an update on this? Would love to help if necessary.

@HarHarLinks

HarHarLinks commented Nov 30, 2022

@debendraoli @charles-zablit @rouaidakacem and everyone else asking to help: could you test this PR and report if it works, and in case you have to make any changes, which ones? Because even with my proposed change, I can't get this to work.

@saiflakhani

Been patiently waiting for this, hoping to see this in a release soon! Would love to help/test if there's an update on this

@saghul
Member

saghul commented Sep 11, 2023

There is currently a conflict that needs to be fixed.

@krombel krombel mentioned this pull request Dec 4, 2023
@loelkes

loelkes commented Jan 30, 2024

Hi @saghul What needs to be done here? I would like to help. It looks like the changes you requested have not been reviewed by you. Or have I missed something?

How can I/we help to get this merged?

@damencho
Member

I forgot to add, that you have to have a working SIP setup (server+account) and config for VOSK to work.
Should I just assume it is setup or make a check (e.g. if JIGASI_SIP_URI is set)?

This is not a jigasi issue, this is configuration issue and should be fixed, maybe in this PR?

@saghul
Member

saghul commented Mar 27, 2024

I think I fixed that already. If you set ENABLE_TRANSCRIPTIONS the sip config is skipped now.

M4GNV5 added a commit to M4GNV5/docker-jitsi-meet that referenced this pull request Jul 16, 2024
@aslam-t

aslam-t commented Jul 17, 2024

Hey fellows, can we please prioritize this somehow? We are running into compliance issues for some users.

@damencho
Member

Can you resolve the conflicts, please?

@damencho
Member

@saghul is there anything missing here?

@zobadaniel

Please take a look at commit M4GNV5@d44daf2 which should address the open issues from @saghul - I am interested in getting this merged as well, thus I asked @M4GNV5 for this contribution.

@aslam-t

aslam-t commented Jul 29, 2024

Please take a look at commit M4GNV5@d44daf2 which should address the open issues from @saghul - I am interested in getting this merged as well, thus I asked @M4GNV5 for this contribution.

Hey @damencho, these commits have no conflict with the jitsi/docker-jitsi-meet master branch. Can your team please check/review?

Also, I am willing to test this work today if there's a container image/tag for it. I am trying to build my own image, but since it's quite a learning curve for me, it will take time. @zobadaniel, if you could provide an image built from the M4GNV5@d44daf2 branch, it would accelerate things.

Thanks

@damencho
Member

Please take a look at commit M4GNV5@d44daf2 which should address the open issues from @saghul - I am interested in getting this merged as well, thus I asked @M4GNV5 for this contribution.

Hey @damencho, these commits have no conflict with the jitsi/docker-jitsi-meet master branch. Can your team please check/review?

Also, I am willing to test this work today if there's a container image/tag for it. I am trying to build my own image, but since it's quite a learning curve for me, it will take time. @zobadaniel, if you could provide an image built from the M4GNV5@d44daf2 branch, it would accelerate things.

Thanks

image

@zobadaniel

Dear @janonym1 can you please update this PR with M4GNV5@d44daf2 so that the requested changes are visible here?

zobadaniel pushed a commit to ZalozbaDev/docker-jitsi-meet that referenced this pull request Jul 30, 2024
@zobadaniel

@aslam-t I have pushed my latest set of jitsi containers to docker, please look here: https://hub.docker.com/repositories/zalozbadev and pick the tag "stable-9584-1_custom-4" for each of the jitsi containers. It includes this PR and #1737 and works for me. I cannot update this PR myself, so the requested changes aren't directly visible. #1737 will be updated with the necessary changes soon, we need to sort out some merge issues there first.

@Marwane2185

Hello @zobadaniel, I have tested your docker image for jigasi and it did not work!
Also, @janonym1, could you please merge it so the changes get into a release?
I want to configure Jigasi with Vosk, but it has not worked yet.

Thanks

@Marwane2185

stable-9584-1_custom-4

@zobadaniel , still getting Transcriptions: One or more google cloud environment variables are undefined

@janonym1
Author

Hi, sorry for the delays, I think I will have time in the coming weekend or week to take another look at this

@aaronkvanmeerten
Member

Hi y'all: sorry for the likely disruptive changes, but I wanted to update everybody on this thread about our plan to move some bits around in order to make the transcriber a separate parallel component. This includes a new transcriber.yaml docker-compose file, and also will split the sip-communicator.properties file into pieces based on the jigasi mode.

All this to say I hope it makes the sort of proposed changes above easier to reason about, but will mean further updates on this PR.

I intend to get some progress on that in the next days, so I will link my PR when it's in any shape to discuss.

@aaronkvanmeerten
Member

My new PR splits out the config into different files based on jigasi role, and adds a new docker-compose file:
#1881
It is currently untested and I hope to get more time this week to move it further.

@Marwane2185

Hello @aaronkvanmeerten, thanks for the update.
I'm wondering if it is possible to get the transcript whenever a new user connects to the conference (to provide them with a summary of what has been discussed previously). This would be very helpful.

Thanks

@aaronkvanmeerten
Member

Hello @aaronkvanmeerten, thanks for the update.

I'm wondering if it is possible to get the transcript whenever a new user connects to the conference (to provide them with a summary of what has been discussed previously). This would be very helpful.

Thanks

This sounds like late-arrival summaries to me. I believe that might be possible, depending on how the transcript is stored. However, this would be a customization via the iframe API or custom client code; it's not something we support out of the box.

@Ali-Aljufairi

I have a question: is it possible to transcribe multiple languages at the same time using gcloud or Vosk?

@aaronkvanmeerten
Member

I have a question: is it possible to transcribe multiple languages at the same time using gcloud or Vosk?

I feel this thread is getting out of context. Please direct such questions to https://community.jitsi.org/
TL;DR; not so far as I know.

@Ali-Aljufairi

I have a question: is it possible to transcribe multiple languages at the same time using gcloud or Vosk?

I feel this thread is getting out of context. Please direct such questions to https://community.jitsi.org/ TL;DR; not so far as I know.

This PR is like 2 years old, so it's normal. Thanks for the guidance, I will ask there.

@oerodriguezn

Hi y'all: sorry for the likely disruptive changes, but I wanted to update everybody on this thread about our plan to move some bits around in order to make the transcriber a separate parallel component. This includes a new transcriber.yaml docker-compose file, and also will split the sip-communicator.properties file into pieces based on the jigasi mode.

All this to say I hope it makes the sort of proposed changes above easier to reason about, but will mean further updates on this PR.

I intend to get some progress on that in the next days, so I will link my PR when it's in any shape to discuss.

Hi @aaronkvanmeerten, great job. I believe your changes are already in the master branch. Do you have an estimated release date? Will it cover this issue, i.e. being able to use, for example, Vosk instead of Google without needing to configure the Google env vars?

Do you have any documentation about the features/bug fixes in this version? I'm about to add transcription to my Jitsi setup, but maybe it is worth waiting for your release.

@aaronkvanmeerten
Member

aaronkvanmeerten commented Sep 25, 2024

[snip]
I intend to get some progress on that in the next days, so I will link my PR when it's in any shape to discuss.

Hi @aaronkvanmeerten, great job. I believe your changes are already in the master branch. Do you have an estimated release date? Will it cover this issue, i.e. being able to use, for example, Vosk instead of Google without needing to configure the Google env vars?

Do you have any documentation about the features/bug fixes in this version? I'm about to add transcription to my Jitsi setup, but maybe it is worth waiting for your release.

We should have a new stable release with the changes already in master this week.

Here's one more PR that needs to be merged to fully support Vosk.
#1915

Once merged/marked stable then you'd just set:

JIGASI_TRANSCRIBER_CUSTOM_SERVICE=org.jitsi.jigasi.transcription.VoskTranscriptionService
JIGASI_TRANSCRIBER_VOSK_URL=wss://your-server.yourdomain.tld/path

Until it's merged, you can provide a custom jigasi config in /config/custom-sip-communicator.properties containing the appropriate snippet; it will be appended to the bottom of the file:
org.jitsi.jigasi.transcription.vosk.websocket_url=wss://your-url.domain.tld/path

Edit: it's merged now, so once we tag the latest stable you should be able to do the environment variable pieces above.
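
The custom-properties workaround described above can be scripted. The following is a minimal sketch, assuming you want to add or update a single `key=value` line in a properties file without creating duplicates. The helper name `set_property` is hypothetical, and the file location on your host depends on how the container's /config directory is mounted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set key=value in a properties file, replacing any
# existing line for the same key so the file never contains duplicates.
set_property() {
    local file="$1" key="$2" value="$3" tmp
    touch "$file"
    tmp="$(mktemp)"
    # Keep every line that does not start with "<key>=" ...
    awk -v k="$key=" 'index($0, k) != 1' "$file" > "$tmp"
    # ... then append the new value at the bottom.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Example call (the path is an assumption; adjust it to your /config mount):
# set_property ./custom-sip-communicator.properties \
#     org.jitsi.jigasi.transcription.vosk.websocket_url \
#     wss://your-url.domain.tld/path
```

Matching on the literal `key=` prefix rather than a regular expression avoids having to escape the dots in the property name.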

@damencho
Member

Vosk config has been added recently.

@damencho damencho closed this Oct 11, 2024
@celevra

celevra commented Oct 23, 2024

The vars are working, and so is Vosk, but it still tries to contact Google.

@aaronkvanmeerten
Member

Can you show any logs? I haven't tested specifically for Vosk only, as I don't have an endpoint for that one. I'm curious to see your jigasi configuration as well as any errors at startup.
Specifically, the variable JIGASI_TRANSCRIBER_CUSTOM_SERVICE SHOULD be setting the default transcriber properly:
it should set org.jitsi.jigasi.transcription.customService= in the sip-communicator.properties file.
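
To check what actually ended up in the generated file, something like the sketch below could be used. It is only an illustration: the helper name `get_property` is hypothetical, the path to sip-communicator.properties depends on your volume mounts, and it assumes that when a key appears more than once the last line wins (as with `java.util.Properties`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of a key from a properties file.
# If the key occurs several times, the last occurrence is printed.
# Returns a non-zero status when the key is not present.
get_property() {
    local file="$1" key="$2"
    awk -v k="$key=" '
        index($0, k) == 1 { value = substr($0, length(k) + 1); found = 1 }
        END { if (found) print value; else exit 1 }
    ' "$file"
}

# Example call (the path is an assumption; adjust it to your setup):
# get_property /config/sip-communicator.properties \
#     org.jitsi.jigasi.transcription.customService
```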

@celevra

celevra commented Oct 23, 2024

jitsi-transcriber-1  | 2024-10-23 15:46:57.888 SEVERE: [163] VoskTranscriptionService$VoskWebsocketStreamingSession.onError#306: Error while streaming audio data to transcription service
jitsi-transcriber-1  | com.google.cloud.translate.TranslateException: Method doesn't allow unregistered callers (callers without established identity). Please use API Key or other form of API consumer identity to call this API.
jitsi-transcriber-1  |  at com.google.cloud.translate.spi.v2.HttpTranslateRpc.translate(HttpTranslateRpc.java:57)
jitsi-transcriber-1  |  at com.google.cloud.translate.spi.v2.HttpTranslateRpc.translate(HttpTranslateRpc.java:126)
jitsi-transcriber-1  |  at com.google.cloud.translate.TranslateImpl$4.call(TranslateImpl.java:124)
jitsi-transcriber-1  |  at com.google.cloud.translate.TranslateImpl$4.call(TranslateImpl.java:121)
jitsi-transcriber-1  |  at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:103)
jitsi-transcriber-1  |  at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
jitsi-transcriber-1  |  at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
jitsi-transcriber-1  |  at com.google.cloud.translate.TranslateImpl.translate(TranslateImpl.java:120)
jitsi-transcriber-1  |  at com.google.cloud.translate.TranslateImpl.translate(TranslateImpl.java:138)
jitsi-transcriber-1  |  at org.jitsi.jigasi.transcription.GoogleCloudTranslationService.translate(GoogleCloudTranslationService.java:60)
jitsi-transcriber-1  |  at org.jitsi.jigasi.transcription.TranslationManager.getTranslations(TranslationManager.java:149)
jitsi-transcriber-1  |  at org.jitsi.jigasi.transcription.TranslationManager.notify(TranslationManager.java:177)
jitsi-transcriber-1  |  at org.jitsi.jigasi.transcription.Transcriber.notify(Transcriber.java:871)
jitsi-transcriber-1  |  at org.jitsi.jigasi.transcription.Participant.notify(Participant.java:585)
jitsi-transcriber-1  |  at org.jitsi.jigasi.transcription.VoskTranscriptionService$VoskWebsocketStreamingSession.onMessage(VoskTranscriptionService.java:283)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.messages.StringMessageSink.accept(StringMessageSink.java:54)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.acceptMessage(JettyWebSocketFrameHandler.java:351)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onTextFrame(JettyWebSocketFrameHandler.java:437)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onFrame(JettyWebSocketFrameHandler.java:241)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession$IncomingAdaptor.lambda$onFrame$1(WebSocketCoreSession.java:671)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.handle(WebSocketCoreSession.java:118)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession$IncomingAdaptor.onFrame(WebSocketCoreSession.java:671)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.ExtensionStack.onFrame(ExtensionStack.java:120)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.onFrame(WebSocketCoreSession.java:481)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFrame(WebSocketConnection.java:271)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.fillAndParse(WebSocketConnection.java:464)
jitsi-transcriber-1  |  at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFillable(WebSocketConnection.java:349)
jitsi-transcriber-1  |  at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
jitsi-transcriber-1  |  at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
jitsi-transcriber-1  |  at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
jitsi-transcriber-1  |  at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
jitsi-transcriber-1  |  at java.base/java.lang.Thread.run(Thread.java:840)
jitsi-transcriber-1  | Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
jitsi-transcriber-1  | GET https://translation.googleapis.com/language/translate/v2?model=nmt&q=bin%20an&source=de&target=en
jitsi-transcriber-1  | {
jitsi-transcriber-1  |   "code" : 403,
jitsi-transcriber-1  |   "errors" : [ {
jitsi-transcriber-1  |     "domain" : "global",
jitsi-transcriber-1  |     "message" : "Method doesn't allow unregistered callers (callers without established identity). Please use API Key or other form of API consumer identity to call this API.",
jitsi-transcriber-1  |     "reason" : "forbidden"
jitsi-transcriber-1  |   } ],
jitsi-transcriber-1  |   "message" : "Method doesn't allow unregistered callers (callers without established identity). Please use API Key or other form of API consumer identity to call this API.",
jitsi-transcriber-1  |   "status" : "PERMISSION_DENIED"
jitsi-transcriber-1  | }
jitsi-transcriber-1  |  at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
jitsi-transcriber-1  |  at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
jitsi-transcriber-1  |  at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
jitsi-transcriber-1  |  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:428)
jitsi-transcriber-1  |  at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
jitsi-transcriber-1  |  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
jitsi-transcriber-1  |  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
jitsi-transcriber-1  |  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
jitsi-transcriber-1  |  at com.google.cloud.translate.spi.v2.HttpTranslateRpc.translate(HttpTranslateRpc.java:112)
jitsi-transcriber-1  |  ... 37 more

this is my .env config

# Enable Jigasi transcription
ENABLE_TRANSCRIPTIONS=1
# Jigasi will post an url to the chat with transcription file [default: false]
JIGASI_TRANSCRIBER_ADVERTISE_URL=true

JIGASI_TRANSCRIBER_CUSTOM_SERVICE=org.jitsi.jigasi.transcription.VoskTranscriptionService
JIGASI_TRANSCRIBER_VOSK_URL=ws://jitsi-vosk:2700

# Credentials for connect to Cloud Google API from Jigasi
# Please read https://cloud.google.com/text-to-speech/docs/quickstart-protocol
# section "Before you begin" paragraph 1 to 5
# Copy the values from the json to the related env vars
#GC_PROJECT_ID=
#GC_PRIVATE_KEY_ID=
#GC_PRIVATE_KEY=
#GC_CLIENT_EMAIL=
#GC_CLIENT_ID=
#GC_CLIENT_CERT_URL=

# Enable recording
ENABLE_RECORDING=1

@aaronkvanmeerten
Member

It appears we need a new flag: JIGASI_TRANSCRIBER_ENABLE_TRANSLATION

#1953

Once this is merged and pushed into stable you can set that:
JIGASI_TRANSCRIBER_ENABLE_TRANSLATION=0

Until it's merged, you can provide a custom jigasi config in /config/custom-sip-communicator.properties containing the appropriate snippet; it will be appended to the bottom of the file:
org.jitsi.jigasi.transcription.ENABLE_TRANSLATION=false
