documentation, fixes, high quality openww models
mikejgray committed Dec 20, 2023
1 parent 05e45f1 commit 7d4f860
Showing 7 changed files with 99 additions and 42 deletions.
11 changes: 6 additions & 5 deletions .github/workflows/publish_test_websat_build.yml
@@ -1,8 +1,9 @@
name: Publish Docker Containers
on:
-registry:
-type: string
-default: ghcr.io
+pull_request:
+branches:
+- dev
+- main

env:
REGISTRY: ghcr.io
@@ -12,7 +13,7 @@ jobs:
build_and_publish_docker:
runs-on: ubuntu-latest
outputs:
-version: ${{ steps.version.version }}
+version: "${{ steps.version.version }}"
permissions:
contents: read
packages: write
@@ -33,7 +34,7 @@ jobs:
id: version
run: |
VERSION=$(sed "s/a/-a./" <<< $(python setup.py --version))
-echo ::set-output name=version::${VERSION}
+echo "version=${VERSION}" >> $GITHUB_OUTPUT
env:
image_name: ${{ env.IMAGE_NAME }}

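For readers unfamiliar with the version step in this workflow, the Python sketch below mirrors what the shell does. `sed "s/a/-a./"` rewrites only the first `a` in the version string, and the new `echo` line appends a `key=value` pair to the file named by `GITHUB_OUTPUT` rather than using the deprecated `::set-output` command. The function names and the sample version `1.0.0a3` are assumptions made for illustration; they are not part of the commit.

```python
import os
import re
import tempfile


def transform_version(version: str) -> str:
    """Mirror `sed "s/a/-a./"`: rewrite only the first "a" in the string."""
    return re.sub("a", "-a.", version, count=1)


def write_step_output(name: str, value: str, output_path: str) -> None:
    """Append a `name=value` line, the format read from the GITHUB_OUTPUT file."""
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")


if __name__ == "__main__":
    # Outside a workflow run GITHUB_OUTPUT is unset, so fall back to a temp file.
    fallback = os.path.join(tempfile.gettempdir(), "github_output.txt")
    output_path = os.environ.get("GITHUB_OUTPUT", fallback)
    write_step_output("version", transform_version("1.0.0a3"), output_path)
```

A version without an `a` passes through unchanged, which matches the behavior of the `sed` expression.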
75 changes: 69 additions & 6 deletions README.md
@@ -1,20 +1,24 @@
# Neon Iris

Neon Iris (Interactive Relay for Intelligence Systems) provides tools for
interacting with Neon systems remotely, via [MQ](https://github.com/NeonGeckoCom/chat_api_mq_proxy).

Install the Iris Python package with: `pip install neon-iris`
The `iris` entrypoint is available to interact with a bus via CLI. Help is available via `iris --help`.

## Configuration
-Configuration files can be specified via environment variables. By default,
-`Iris` will read configuration from `~/.config/neon/diana.yaml` where
+
+Configuration files can be specified via environment variables. By default,
+`Iris` will read configuration from `~/.config/neon/diana.yaml` where
`XDG_CONFIG_HOME` is set to the default `~/.config`.
-More information about configuration handling can be found
+More information about configuration handling can be found
[in the docs](https://neongeckocom.github.io/neon-docs/quick_reference/configuration/).
-> *Note:* The neon-iris Docker image uses `neon.yaml` by default because the
+
+> _Note:_ The neon-iris Docker image uses `neon.yaml` by default because the
> `iris` web UI is often deployed with neon-core.
A default configuration might look like:

```yaml
MQ:
server: neonaialpha.com
@@ -34,22 +38,81 @@ iris:
```
### Language Support
For Neon Core deployments that support language support queries via MQ, `languages`
may be removed and `enable_lang_api: True` added to configuration. This will use
the reported STT/TTS supported languages in place of any `iris` configuration.

## Interfacing with a Diana installation

The `iris` CLI includes utilities for interacting with a `Diana` backend. Use
`iris --help` to get a current list of available commands.

### `iris start-listener`
-This will start a local wake word recognizer and use a remote Neon
+
+This will start a local wake word recognizer and use a remote Neon
instance connected to MQ for processing audio and providing responses.

### `iris start-gradio`

This will start a local webserver and serve a Gradio UI to interact with a Neon
instance connected to MQ.

### `iris start-client`
-This starts a CLI client for typing inputs and receiving responses from a Neon
+
+This starts a CLI client for typing inputs and receiving responses from a Neon
instance connected via MQ.

+### `iris start-websat`
+
+This starts a local webserver and serves a web UI for interacting with a Neon
+instance connected to MQ.
+
+## websat
+
+### Configuration
+
+The `websat` web UI is a simple web UI for interacting with a Neon instance. It
+accepts special configuration items prefixed with `webui_` to customize the UI.
+
+| parameter | description | default |
+| ----------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | ---------------------- |
+| webui_description | The header text for the web UI | Chat with Neon |
+| webui_title | The title text for the web UI in the browser | Neon AI |
+| webui_input_placeholder | The placeholder text for the input box | Ask me something |
+| webui_ws_url | The websocket URL to connect to, which must be accessible from the browser you're running in. Note that the default will usually fail. | ws://localhost:8000/ws |
+
+Example configuration:
+
+```yaml
+iris:
+webui_title: Neon AI
+webui_description: Chat with Neon
+webui_input_placeholder: Ask me something
+webui_ws_url: wss://neonaialpha.com:8000/ws
+```
+
+### Customization
+
+The websat web UI reads in the following items from `neon_iris/static`:
+
+- `error.mp3` - Used for error responses
+- `wake.mp3` - Used for wake word responses
+- `favicon.ico` - The favicon for the web UI
+- `logo.webp` - The logo for the web UI
+
+To customize these items, you can replace them in the `neon_iris/static` folder.
+
+### Websocket endpoint
+
+The websat web UI uses a websocket to communicate with OpenWakeWord, which can
+load `.tflite` or `.onnx` models. The websocket endpoint is `/ws`, but since it
+is served with FastAPI, it also supports `wss` for secure connections. To
+use `wss`, you must provide a certificate and key file.
+
+### Chat history
+
+The websat web UI stores chat history in the browser's [local storage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage).
+This allows chat history to persist between browser sessions. However, it also
+means that if you clear your browser's local storage, you will lose your chat
+history. This is a feature, not a bug.
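As a rough illustration of how the `webui_` options in the README's parameter table could combine with their defaults, here is a small Python sketch. `WEBUI_DEFAULTS` and `resolve_webui_settings` are hypothetical names introduced for this example; they are not taken from the `neon_iris` code, and the real configuration loader may behave differently.

```python
from typing import Any, Dict, Mapping

# Defaults copied from the parameter table in the README.
WEBUI_DEFAULTS: Dict[str, str] = {
    "webui_description": "Chat with Neon",
    "webui_title": "Neon AI",
    "webui_input_placeholder": "Ask me something",
    "webui_ws_url": "ws://localhost:8000/ws",
}


def resolve_webui_settings(iris_config: Mapping[str, Any]) -> Dict[str, Any]:
    """Overlay any `webui_*` keys from the `iris` config section onto the defaults."""
    overrides = {
        key: value for key, value in iris_config.items() if key.startswith("webui_")
    }
    return {**WEBUI_DEFAULTS, **overrides}


if __name__ == "__main__":
    settings = resolve_webui_settings(
        {"webui_ws_url": "wss://neonaialpha.com:8000/ws"}
    )
    print(settings["webui_ws_url"])
```

Only keys with the `webui_` prefix are treated as overrides here, so other `iris` settings such as `languages` are ignored by this helper.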
34 changes: 18 additions & 16 deletions neon_iris/static/scripts/audio.js
@@ -7,15 +7,17 @@ const AudioHandler = (() => {
let sampleRate;
let isRecording = false;

-// Ensure the getUserMedia is correctly referenced
-const getUserMedia = navigator.getUserMedia ||
-navigator.webkitGetUserMedia ||
-navigator.mozGetUserMedia ||
-navigator.msGetUserMedia;
+// Ensure the getUserMedia is correctly referenced
+const getUserMedia =
+navigator.getUserMedia ||
+navigator.webkitGetUserMedia ||
+navigator.mozGetUserMedia ||
+navigator.msGetUserMedia;

const startAudio = () => {
if (getUserMedia) {
-getUserMedia.call(navigator,
+getUserMedia.call(
+navigator,
{ audio: true },
(stream) => {
audioStream = stream;
@@ -57,12 +59,12 @@ const AudioHandler = (() => {
if (recorder) {
recorder.disconnect();
volume.disconnect();
-// Disconnecting the audio context might not be necessary; depends on your use case.
+// Disconnecting the audio context might not be necessary
// audioContext.close();
}
if (audioStream) {
const tracks = audioStream.getTracks();
-tracks.forEach(track => track.stop());
+tracks.forEach((track) => track.stop());
}
}
};
@@ -82,7 +84,7 @@ const AudioHandler = (() => {
let l = buffer.length;
let buf = new Int16Array(l);
while (l--) {
-buf[l] = Math.min(1, buffer[l]) * 0x7FFF;
+buf[l] = Math.min(1, buffer[l]) * 0x7fff;
}
return buf.buffer;
};
@@ -93,16 +95,16 @@ const AudioHandler = (() => {
};
})();

-const startButton = document.getElementById('startButton');
-startButton.addEventListener('click', function() {
-AudioHandler.toggle(); // Call the toggle method
+const startButton = document.getElementById("startButton");
+startButton.addEventListener("click", function () {
+AudioHandler.toggle();

// Update the button's text and class based on the recording state
if (AudioHandler.isRecording()) {
-startButton.classList.add('listening');
-startButton.textContent = 'Listening...';
+startButton.classList.add("listening");
+startButton.textContent = "Listening...";
} else {
-startButton.classList.remove('listening');
-startButton.textContent = 'Start Listening';
+startButton.classList.remove("listening");
+startButton.textContent = "Start Listening";
}
});
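The conversion loop in `audio.js` scales each float sample by `0x7fff` after capping it at 1. A Python sketch of the same idea follows. As an assumption beyond the original code, it also clamps at -1 so that out-of-range negative samples cannot wrap around, and it fixes the byte order to little-endian. `float_to_int16_pcm` is a hypothetical name used only for this example.

```python
import struct
from typing import Iterable


def float_to_int16_pcm(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for sample in samples:
        # Clamp both ends; the JavaScript in the diff only caps the upper bound.
        clamped = max(-1.0, min(1.0, sample))
        # int() truncates toward zero, as storing into an Int16Array does.
        ints.append(int(clamped * 0x7FFF))
    return struct.pack(f"<{len(ints)}h", *ints)


if __name__ == "__main__":
    pcm = float_to_int16_pcm([0.0, 0.5, 1.0, -1.0])
    print(len(pcm))  # two bytes per sample
```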
Binary file not shown.
Binary file not shown.
13 changes: 2 additions & 11 deletions neon_iris/web_sat_client.py
@@ -52,7 +52,7 @@ def __init__(self, lang: str = None):
# OpenWW
# TODO: Allow for arbitrary models, or pre-existing OpenWW models
self.oww_model = Model(
-wakeword_models=["neon_iris/wakeword_models/hey_neon/hey_neon.tflite"],
+wakeword_models=["neon_iris/wakeword_models/hey_neon/hey_neon_high.tflite"],
inference_framework="tflite",
)
# FastAPI
@@ -83,26 +83,17 @@ def handle_klat_response(self, message: Message):
"""
LOG.debug(f"gradio context={message.context['gradio']}")
resp_data = message.data["responses"]
-files = []
sentences = []
session = message.context["gradio"]["session"]
for _, response in resp_data.items(): # lang, response
sentences.append(response.get("sentence"))
if response.get("audio"):
for _, data in response["audio"].items():
-# filepath = "/".join(
-# [self.audio_cache_dir] + response[gender].split("/")[-4:]
-# )
-# TODO: This only plays the most recent, so it doesn't
-# support multiple languages or multi-utterance responses
self._current_tts[session] = data
-# files.append(filepath)
-# if not isfile(filepath):
-# decode_base64_string_to_file(data, filepath)
self._response = "\n".join(sentences)
self._await_response.set()

-def send_audio(
+def send_audio( # pylint: disable=arguments-renamed
self,
audio_b64_string: str,
lang: str = "en-us",
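To make the data flow in `handle_klat_response` easier to follow, here is a standalone sketch of the same loop over a plain dictionary. The payload shape is inferred from the hunk above; `summarize_responses` and the sample data in the usage lines are hypothetical. Unlike the original, this sketch substitutes an empty string for a missing sentence so the final join cannot fail.

```python
from typing import Any, Dict, Optional, Tuple


def summarize_responses(
    responses: Dict[str, Dict[str, Any]]
) -> Tuple[str, Optional[str]]:
    """Join every sentence with newlines and keep only the last audio payload seen."""
    sentences = []
    latest_audio = None
    for _lang, response in responses.items():
        sentences.append(response.get("sentence") or "")
        for _key, data in (response.get("audio") or {}).items():
            # Later entries overwrite earlier ones, so only the most recent
            # audio payload survives the loop.
            latest_audio = data
    return "\n".join(sentences), latest_audio


if __name__ == "__main__":
    text, audio = summarize_responses(
        {"en-us": {"sentence": "Hello", "audio": {"female": "AAA"}}}
    )
    print(text, audio)
```

Keeping a single audio value per session is the limitation that makes multi-language or multi-utterance replies play only their final clip.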
8 changes: 4 additions & 4 deletions requirements/web_sat.txt
@@ -1,8 +1,8 @@
-fastapi
-uvicorn
+fastapi~=0.104.1
+uvicorn~=0.24.0.post1
aiohttp~=3.8.6
resampy~=0.4.2
openwakeword~=0.5.1
tflite~=2.10.0
-onnxruntime
-jinja2
+onnxruntime~=1.16.3
+jinja2~=3.1.2
