Attempting to read `――は` aloud causes "500 Internal Server Error" #17

wakame-tech · 2024-12-10T16:03:12Z

Bug description

Attempting to read text such as ――は (two full-width dashes followed by 「は」) aloud results in a "500 Internal Server Error".
By contrast, ――あ does not produce an error.

Symptoms / logs

$ poetry run task serve

# 「――あ」
[2024/12/11 00:36:31] INFO:     127.0.0.1:55060 - "POST /audio_query?text=%E2%80%94%E2%80%94%E3%81%82&speaker=888753760 HTTP/1.1" 200 OK

# 「――が」
[2024/12/11 00:38:21] ERROR:    Internal Server Error occurred.
Traceback (most recent call last):
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
    await route.handle(scope, receive, send)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~//Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/voicevox_engine/app/routers/tts_pipeline.py", line 100, in audio_query
    accent_phrases = engine.create_accent_phrases(text, style_id)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/voicevox_engine/tts_pipeline/style_bert_vits2_tts_engine.py", line 424, in create_accent_phrases
    accent_phrases = self._debug_create_accent_phrases(text)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/Documents/AivisSpeech-Engine/voicevox_engine/tts_pipeline/style_bert_vits2_tts_engine.py", line 324, in _debug_create_accent_phrases
    if sep_phonemes_with_joshi_mora_index >= len(sep_phonemes_with_joshi[sep_phonemes_with_joshi_index]):  # fmt: skip
                                                 ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: list index out of range
[2024/12/11 00:38:21] INFO:     127.0.0.1:55732 - "POST /audio_query?text=%E2%80%94%E2%80%94%E3%81%8C&speaker=888753760 HTTP/1.1" 500 Internal Server Error

Steps to reproduce

Clone Aivis-Project/AivisSpeech-Engine.git and follow the "Development environment setup" (「開発環境の構築」) instructions
Run poetry run task serve to start the API server
Open http://127.0.0.1:10101/docs and execute POST /audio_query with text: ――は, speaker: 888753760
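
The same request can also be sent without the /docs page. The following is a minimal sketch, not part of the original report, using only the Python standard library. The host, port, endpoint, and speaker ID are taken from the steps and logs in this issue; the helper names `build_audio_query_url` and `post_audio_query` are hypothetical. The dash is written as U+2014 because that is the character the server log shows percent-encoded as `%E2%80%94`.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:10101"  # address from the reproduction steps
SPEAKER = 888753760                  # speaker ID from the reproduction steps


def build_audio_query_url(text, speaker=SPEAKER, base_url=BASE_URL):
    """Build the /audio_query URL with percent-encoded query parameters."""
    query = urllib.parse.urlencode({"text": text, "speaker": speaker})
    return f"{base_url}/audio_query?{query}"


def post_audio_query(text):
    """POST to /audio_query and return the HTTP status code.

    Requires the engine to be running locally (poetry run task serve).
    """
    request = urllib.request.Request(
        build_audio_query_url(text), data=b"", method="POST"
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as HTTPError; report the code instead.
        return error.code


# Example usage with the server running:
#   post_audio_query("\u2014\u2014\u3042")  # two dashes + "あ"
#   post_audio_query("\u2014\u2014\u306f")  # two dashes + "は"
```

According to the logs in this report, the first call should return 200 and the second should return 500 on version 1.1.0-dev (1490102).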

Expected behavior

The request returns 200 instead of a 500 error

AivisSpeech Engine version

1.1.0-dev (1490102)

OS type / version

Windows
macOS
Linux

Other

After a brief investigation, the error appears to be caused by an element-count mismatch at the location below. Adding the following assertion there raises an AssertionError.

AivisSpeech-Engine/voicevox_engine/tts_pipeline/style_bert_vits2_tts_engine.py

Line 284 in 1490102

    
           sep_phonemes_with_joshi = _sep_kata_with_joshi2sep_phonemes_with_joshi(sep_kata_with_joshi)  # fmt: skip

assert len(mora_tone_list) == sum(map(len, sep_phonemes_with_joshi))

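I tried several examples, and the element counts stop matching when two full-width dashes are followed by what appears to be a particle (「は」, 「が」, 「で」).
For reference, the logs from my investigation are included below. I hope they help resolve the issue.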

# StyleBertVITS2TTSEngine._debug_create_accent_phrases("―は青")
mora_tone_list['-', 'ワ', 'ア', 'オ']
sep_kata_with_joshi=['-ワ', 'アオ']
sep_phonemes_with_joshi=[[(None, '-'), ('w', 'a')], [(None, 'a'), (None, 'o')]]

# StyleBertVITS2TTSEngine._debug_create_accent_phrases("――あ青")
mora_tone_list['-', '-', 'ア', 'ア', 'オ']
sep_kata_with_joshi=['--', 'ア', 'アオ']
sep_phonemes_with_joshi=[[(None, '-'), (None, '-')], [(None, 'a')], [(None, 'a'), (None, 'o')]]

# StyleBertVITS2TTSEngine._debug_create_accent_phrases("――は青")
mora_tone_list['-', '-', 'ワ', 'ア', 'オ']
sep_kata_with_joshi=['--ワ', 'アオ']
sep_phonemes_with_joshi=[[(None, '--'), ('w', 'a')], [(None, 'a'), (None, 'o')]]
#  Error
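
The mismatch can be checked offline from the logged values alone. The sketch below is not the engine's code: it copies `mora_tone_list` and `sep_phonemes_with_joshi` from the three logs above and compares the counts that the assertion compares. `count_phonemes` and `split_dash_runs` are hypothetical helpers; the latter only illustrates one possible normalization (expanding a fused `'--'` entry into one entry per dash) and makes no claim about how the engine should actually be fixed.

```python
# Values copied verbatim from the investigation logs in this report:
# (input text, mora_tone_list, sep_phonemes_with_joshi)
CASES = [
    ("―は青",
     ["-", "ワ", "ア", "オ"],
     [[(None, "-"), ("w", "a")], [(None, "a"), (None, "o")]]),
    ("――あ青",
     ["-", "-", "ア", "ア", "オ"],
     [[(None, "-"), (None, "-")], [(None, "a")], [(None, "a"), (None, "o")]]),
    ("――は青",
     ["-", "-", "ワ", "ア", "オ"],
     [[(None, "--"), ("w", "a")], [(None, "a"), (None, "o")]]),
]


def count_phonemes(sep_phonemes_with_joshi):
    """Total number of (consonant, vowel) entries, as in the assertion."""
    return sum(map(len, sep_phonemes_with_joshi))


def split_dash_runs(sep_phonemes_with_joshi):
    """Hypothetical normalization: turn (None, '--') into two (None, '-')."""
    repaired = []
    for group in sep_phonemes_with_joshi:
        new_group = []
        for consonant, vowel in group:
            if consonant is None and len(vowel) > 1 and set(vowel) == {"-"}:
                # One entry per dash so each dash counts as its own mora.
                new_group.extend((None, "-") for _ in vowel)
            else:
                new_group.append((consonant, vowel))
        repaired.append(new_group)
    return repaired


for text, moras, sep in CASES:
    status = "ok" if len(moras) == count_phonemes(sep) else "MISMATCH"
    print(text, len(moras), count_phonemes(sep), status)
```

Only the third case is off by one (5 moras against 4 phoneme entries), because the two dashes are fused into a single `(None, '--')` entry when a particle follows. Applying `split_dash_runs` to that case restores the count to 5.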


tsukumijima · 2024-12-10T16:34:51Z

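@wakame-tech
Thank you for the report! This is such an edge case that I was not aware of it.
This part of the code forces things to line up in order to match the VOICEVOX API as closely as possible, even though the underlying g2p specification differs, so irregular text tends to expose cracks here.
There is currently a backlog of other fixes, so I cannot address this right away, but I will fix it soon.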
