add stt output to input prompt area #3269

Open
wants to merge 7 commits into base: master

Conversation

yashschandra
Contributor

@yashschandra yashschandra commented Feb 19, 2025

Pull Request Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 🔨 chore
  • 📝 docs

Relevant Issues

resolves #3268

What is in this change?

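Voice commands are no longer sent directly to the LLM provider as a prompt. The speech-to-text output is placed in the prompt input area instead, so the user gets a chance to edit or add to it before submitting, for example after a long pause while speaking.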
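
The change described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `appendTranscript` and its parameters are hypothetical names, and joining the spoken text to any existing input with a single space is an assumption.

```typescript
// Hypothetical helper: merge a finished speech-to-text transcript into the
// prompt input instead of sending it to the LLM provider.
function appendTranscript(currentInput: string, transcript: string): string {
  const spoken = transcript.trim();
  // Nothing was recognized, so leave the input exactly as the user typed it.
  if (spoken === "") return currentInput;

  const existing = currentInput.trimEnd();
  // Assumption: spoken text is appended after a single space.
  return existing === "" ? spoken : `${existing} ${spoken}`;
}

// The caller would set the input field to this value and wait for the user
// to press send, rather than submitting the prompt immediately.
const nextInput = appendTranscript("Summarize", "  this document ");
console.log(nextInput);
```

Because the helper only returns the new input text, the user stays in control of when the prompt is actually sent.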

Additional Information

Before -

before.mov

Now -

after.mov

Additionally fixed a minor typo: endTTSSession -> endSTTSession

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@timothycarambat
Member

FWIW, it used to be this way, then people wanted it to autosubmit, now this PR would change it back

@therealtimex

Can we implement two modes:

  • A long press of the microphone icon activates "continuous" mode, which enables autosubmit. The UI should visually indicate that this mode is active.
  • A short press of the microphone icon triggers manual submit.
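
For illustration only, the two-mode proposal above could be reduced to classifying how long the microphone icon was held. The names `MicMode` and `micModeForPress` and the 500 ms threshold are assumptions and do not come from the project.

```typescript
// Hypothetical modes: "manual" leaves the transcript in the input area,
// "continuous" enables autosubmit.
type MicMode = "manual" | "continuous";

// Assumed long-press threshold in milliseconds.
const LONG_PRESS_MS = 500;

// Decide the mode from the press and release timestamps of the mic icon.
function micModeForPress(
  pressedAtMs: number,
  releasedAtMs: number,
  thresholdMs: number = LONG_PRESS_MS
): MicMode {
  // Guard against clock oddities that would yield a negative duration.
  const heldMs = Math.max(0, releasedAtMs - pressedAtMs);
  return heldMs >= thresholdMs ? "continuous" : "manual";
}

console.log(micModeForPress(0, 120)); // short press
console.log(micModeForPress(0, 900)); // long press
```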

@yashschandra
Contributor Author

yashschandra commented Feb 19, 2025

> FWIW, it used to be this way, then people wanted it to autosubmit, now this PR would change it back

I get this, but I feel that for a non-native English speaker (like myself) it may be useful.

> Can we implement two modes:
>
>   • A long press of the microphone icon activates "continuous" mode, which enables autosubmit. The UI should visually indicate that this mode is active.
>   • A short press of the microphone icon triggers manual submit.

I was thinking of having this configuration in Settings, but this may also work. @timothycarambat any thoughts on this approach?

@yashschandra
Contributor Author

@timothycarambat is there any chance a feature like this (or something similar) can be included?

@timothycarambat
Member

@therealtimex That kind of UX is ambiguous and is bound to be non-discoverable. We will have to make this a setting in the Voice and Speech area or elsewhere so that we can stop flip-flopping PRs every couple of months on this.

@therealtimex

Agreed, a setting is always a safe bet.

@yashschandra
Contributor Author

Added a checkbox for Autosubmit in the Voice and Speech settings section -

output.mp4
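
As a rough sketch of how such a setting might gate the behavior: everything named here (`VoiceSettings`, `autoSubmit`, `handleTranscript`) is hypothetical and not taken from the PR's code. Whether already-typed text is submitted together with the spoken text is also an assumption.

```typescript
// Hypothetical shape of the stored Voice and Speech preference.
interface VoiceSettings {
  autoSubmit: boolean;
}

// What the chat UI should do after a transcript arrives.
interface TranscriptResult {
  input: string; // new contents of the prompt input area
  submit: string | null; // prompt to send now, or null to wait for the user
}

function handleTranscript(
  settings: VoiceSettings,
  currentInput: string,
  transcript: string
): TranscriptResult {
  const spoken = transcript.trim();
  // Nothing recognized: keep the input and send nothing.
  if (spoken === "") return { input: currentInput, submit: null };

  const existing = currentInput.trimEnd();
  const combined = existing === "" ? spoken : `${existing} ${spoken}`;

  // Autosubmit on: send immediately and clear the input area (assumed).
  if (settings.autoSubmit) return { input: "", submit: combined };

  // Autosubmit off: leave the text in the input area for editing.
  return { input: combined, submit: null };
}

console.log(handleTranscript({ autoSubmit: false }, "", "what is RAG"));
console.log(handleTranscript({ autoSubmit: true }, "", "what is RAG"));
```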

Successfully merging this pull request may close these issues.

[FEAT]: Speech to text confirmation before submission
3 participants