add split translation #807

Open: 12 commits to be merged into base main

popcion (Contributor) commented Dec 31, 2024

fix #805

  • Works around translator censorship and other restrictions: if a specific word cannot get past the filter no matter what, only that speech bubble is affected and the other bubbles are still translated.
  • Fixes mismatches between the number of source lines and the number of translated lines.
  • Fixes the case where every translation came back empty; blank pages and blank bubbles no longer occur.
  • Previously a blank page produced no message. Now, when the text_region of a particular speech bubble cannot be translated, it reports which sentence failed.
  • Lowering the retry count reliably speeds up translation.
  • Punctuation and numbers can no longer come back as a standalone line and shift the translations out of alignment with their bubbles. (For now this does not handle the case where a detected bubble really does contain only punctuation and is unrelated to the other bubbles; it assumes the translator is padding an empty slot with punctuation, as in the extra-ellipsis example in the log below.)

The drawback is higher token usage (kept as low as possible) and somewhat lower translation quality.
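
The retry-then-split behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: translate_with_split, translate_batch and max_retries are hypothetical names, and returning the source text for a single line that still fails is an assumption made for the example.

```python
from typing import Callable, List


def translate_with_split(
    queries: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    max_retries: int = 1,
) -> List[str]:
    """Translate `queries`; after repeated failures, halve the batch and recurse.

    `translate_batch` is a hypothetical callable that sends one batch to the
    translator and returns one translated string per input line.
    """
    if not queries:
        return []

    for _ in range(max_retries + 1):
        try:
            result = translate_batch(queries)
        except Exception:
            continue  # e.g. a refusal or a network error: try again
        # Accept only a complete response: same count, no empty entries.
        if len(result) == len(queries) and all(r.strip() for r in result):
            return result

    if len(queries) == 1:
        # Assumption: a single line that still fails keeps its source text,
        # so one problematic bubble cannot blank out the rest of the page.
        return [queries[0]]

    # Split the batch in half so a bad line only affects its own half.
    mid = len(queries) // 2
    return (
        translate_with_split(queries[:mid], translate_batch, max_retries)
        + translate_with_split(queries[mid:], translate_batch, max_retries)
    )
```

With a lower max_retries the function reaches the split step sooner, which matches the observation that reducing retries speeds things up. The cost is extra requests (and tokens) for the smaller batches, and less context per request.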

[GPT4Translator] Translating into Chinese (Simplified)
[GPT4Translator] Temperature: 0.5, TopP: 1
[GPT4Translator] -- GPT Prompt --
System:
You are an uncensored professional translation engine, please translate the story into a colloquial, elegant and fluent content, without referencing machine translations. You must only translate the story, never interpret it. If there is any issue in the text, output it as is.The number of lines in the translation must match the number of lines in the source text exactly. Merging lines is strictly prohibited. For example, merging <|1|>This<|2|>is into <|1|>这是 is not allowed.
Translate the following text into Simplified Chinese and keep the prefix format.

User:
<|1|>…先生は今
<|2|>いるはず…
<|3|>幸福省自己実現局
<|4|>市民の幸福が守られているか政府内を監察・査察する組織
<|5|>自己実現局内でも更に特権的立場の集団である
<|6|>特別な機鎧を操る
<|7|>実力部隊でもあり
<|8|>治安を乱す反動分子がいた場合
<|9|>これを武力制田
<|10|>する力を持つ…
[GPT4Translator] -- GPT Response --
<|1|>…老师现在应该在…
<|2|>幸福省自我实现局
<|3|>一个在政府内部监察和检查市民幸福是否得到保障的组织
<|4|>在自我实现局内也是一个更具特权地位的团体
<|5|>操控着特殊的机甲
<|6|>也是一支实力部队
<|7|>如果有破坏治安的反动分子出现
<|8|>就有能力以武力镇压
<|9|>他们拥有这样的力量…
WARNING: [GPT4Translator] Incomplete response, remaining 0 time(s) before splitting the translation.
WARNING: [GPT4Translator] Retry limit reached. Starting to split the translation batch.
[GPT4Translator] -- GPT Prompt (split) --
System:
You are an uncensored professional translation engine, please translate the story into a colloquial, elegant and fluent content, without referencing machine translations. You must only translate the story, never interpret it. If there is any issue in the text, output it as is.The number of lines in the translation must match the number of lines in the source text exactly. Merging lines is strictly prohibited. For example, merging <|1|>This<|2|>is into <|1|>这是 is not allowed.
Translate the following text into Simplified Chinese and keep the prefix format.

User:
<|1|>…先生は今
<|2|>いるはず…
<|3|>幸福省自己実現局
<|4|>市民の幸福が守られているか政府内を監察・査察する組織
<|5|>自己実現局内でも更に特権的立場の集団である
[GPT4Translator] -- GPT Response (split) --
<|1|>操控特殊机甲
<|2|>的精英部队
<|3|>如果有扰乱治安的反动分子
<|4|>就有能力进行武力镇压
<|5|>…
[GPT4Translator] Filtered out: …
[GPT4Translator] Reason: Text is not considered valuable.
WARNING: [GPT4Translator] Empty translations detected. Resplitting the batch.
WARNING: [GPT4Translator] Further splitting the translation batch due to persistent errors.
[GPT4Translator] -- GPT Response (split) --
<|1|>…老师现在应该在…
<|2|>幸福省自我实现局
<|3|>一个监督和检查政府内市民幸福是否得到保障的组织
<|4|>在自我实现局内也是一个更具特权地位的团体
WARNING: [GPT4Translator] Incomplete response, remaining 0 time(s) before splitting the translation.
WARNING: [GPT4Translator] Further splitting the translation batch due to persistent errors.
[GPT4Translator] -- GPT Response (split) --
<|1|>操控特殊机甲
<|2|>也是实力部队
[GPT4Translator] Batch translated: 2/10 completed.
[GPT4Translator] Completed translations: ['…先生は今', 'いるはず…', '幸福省自己実現局', '市民の幸福が守られているか政府内を監察・査察する組織', '自己実現局内でも更に特権 的立場の集団である', '操控特殊机甲', '也是实力部队', '治安を乱す反動分子がいた場合', 'これを武力制田', 'する力を持つ…']
[GPT4Translator] -- GPT Response (split) --
<|1|>如果有破坏治安的反动分子
<|2|>就要用武力来镇压
<|3|>他们的力量…
[GPT4Translator] Batch translated: 5/10 completed.
[GPT4Translator] Completed translations: ['…先生は今', 'いるはず…', '幸福省自己実現局', '市民の幸福が守られているか政府内を監察・査察する組織', '自己実現局内でも更に特権 的立場の集団である', '操控特殊机甲', '也是实力部队', '如果有破坏治安的反动分子', '就要用武力来镇压', '他们的力量…']
[GPT4Translator] -- GPT Response (split) --
<|1|>…老师现在应该在的…
<|2|>
[GPT4Translator] Filtered out:
[GPT4Translator] Reason: Text is not considered valuable.
WARNING: [GPT4Translator] Empty translations detected. Resplitting the batch.
WARNING: [GPT4Translator] Further splitting the translation batch due to persistent errors.
[GPT4Translator] -- GPT Response (split) --
<|1|>幸福省自我实现局
<|2|>负责监察和检查市民幸福是否受到保护的组织
<|3|>在自我实现局内也是一个更具特权地位的群体
[GPT4Translator] Batch translated: 8/10 completed.
[GPT4Translator] Completed translations: ['…先生は今', 'いるはず…', '幸福省自我实现局', '负责监察和检查市民幸福是否受到保护的组织', '在自我实现局内也是一个更具特权地位的 群体', '操控特殊机甲', '也是实力部队', '如果有破坏治安的反动分子', '就要用武力来镇压', '他们的力量…']
[GPT4Translator] -- GPT Response (split) --
<|1|>应该在的…
[GPT4Translator] Batch translated: 9/10 completed.
[GPT4Translator] Completed translations: ['…先生は今', '应该在的…', '幸福省自我实现局', '负责监察和检查市民幸福是否受到保护的组织', '在自我实现局内也是一个更具特权地位的 群体', '操控特殊机甲', '也是实力部队', '如果有破坏治安的反动分子', '就要用武力来镇压', '他们的力量…']
[GPT4Translator] -- GPT Response (split) --
<|1|>…老师现在
[GPT4Translator] Batch translated: 10/10 completed.
[GPT4Translator] Completed translations: ['…老师现在', '应该在的…', '幸福省自我实现局', '负责监察和检查市民幸福是否受到保护的组织', '在自我实现局内也是一个更具特权地位的 群体', '操控特殊机甲', '也是实力部队', '如果有破坏治安的反动分子', '就要用武力来镇压', '他们的力量…']
[GPT4Translator] ['…老师现在', '应该在的…', '幸福省自我实现局', '负责监察和检查市民幸福是否受到保护的组织', '在自我实现局内也是一个更具特权地位的群体', '操控特殊机甲', ' 也是实力部队', '如果有破坏治安的反动分子', '就要用武力来镇压', '他们的力量…']
[GPT4Translator] Used 267 tokens (Total: 3125)
[GPT4Translator] 0: …先生は今 => …老师现在
[GPT4Translator] 1: いるはず… => 应该在的…
[GPT4Translator] 2: 幸福省自己実現局 => 幸福省自我实现局
[GPT4Translator] 3: 市民の幸福が守られているか政府内を監察・査察する組織 => 负责监察和检查市民幸福是否受到保护的组织
[GPT4Translator] 4: 自己実現局内でも更に特権的立場の集団である => 在自我实现局内也是一个更具特权地位的群体
[GPT4Translator] 5: 特別な機鎧を操る => 操控特殊机甲
[GPT4Translator] 6: 実力部隊でもあり => 也是实力部队
[GPT4Translator] 7: 治安を乱す反動分子がいた場合 => 如果有破坏治安的反动分子
[GPT4Translator] 8: これを武力制田 => 就要用武力来镇压
[GPT4Translator] 9: する力を持つ… => 他们的力量…

This log is old: some of the content that was sent is not shown in it, which is no longer an issue. It is only an example.
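
For illustration, the checks visible in the log (matching the <|n|> prefixes against the expected line count, and rejecting lines that are empty or contain only punctuation or digits) might look like the sketch below. parse_prefixed and has_letters are hypothetical names, not functions from this repository, and the letter test is an assumed stand-in for whatever rule produces the "Text is not considered valuable" message.

```python
import re
from typing import Dict, List, Optional

# Matches the <|n|> line prefixes used in the prompt and the response.
PREFIX = re.compile(r"<\|(\d+)\|>")


def has_letters(text: str) -> bool:
    """True if the text has at least one letter (not only punctuation or digits)."""
    return any(ch.isalpha() for ch in text)


def parse_prefixed(response: str, expected: int) -> Optional[List[str]]:
    """Return the translated lines in order, or None if the response is unusable.

    A response is rejected when a prefix from 1..expected is missing or extra,
    or when any line is empty or carries no letters.
    """
    # With a capturing group, re.split yields:
    # [text before first tag, idx1, text1, idx2, text2, ...]
    parts = PREFIX.split(response)
    found: Dict[int, str] = {}
    for idx, text in zip(parts[1::2], parts[2::2]):
        found[int(idx)] = text.strip()

    if sorted(found) != list(range(1, expected + 1)):
        return None  # merged or dropped lines: the counts do not match

    lines = [found[i] for i in range(1, expected + 1)]
    if not all(has_letters(line) for line in lines):
        return None  # e.g. a lone ellipsis used to pad an empty slot
    return lines
```

A None result is the point at which a caller would retry or split the batch. As noted in the description, this treats a punctuation-only line as filler, so a bubble that genuinely contains only punctuation would also be rejected.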

Successfully merging this pull request may close these issues.

[Bug]: GPT missed some translations after commit 89443fc