After translating with Chinese, the returned subtitle file is fragmented: lines break before the sentence is complete #345

Open
yxh1065494705yxh opened this issue Dec 27, 2024 · 10 comments

Comments

@yxh1065494705yxh

yxh1065494705yxh commented Dec 27, 2024

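Hello. After translating with Chinese, the returned subtitle file has a strange format. For example, I used the following content as input: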
从李闻骁床上起来
他懒洋洋地把内衣递给我
「我把大门密码改了
以后没事儿别来了 」
我一愣 下意识问
「为什么 」
他勾唇笑道
「她昨天答应当我女朋友了 」

But the returned subtitles came out like this:
image

Normally, a result in this format would be more correct:
image

I used the latest version with the following command:
image

Could you help me find out where the problem is?

@yxh1065494705yxh
Author

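Could someone please take a look at this? For Chinese, the output basically has to look like the following to count as correct. Other software I have used in China also returns results in this form.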
image

@ongcl03

ongcl03 commented Dec 31, 2024

When I use your text, the generated SRT is as follows:

1
00:00:00,090 --> 00:00:00,290

2
00:00:00,310 --> 00:00:00,660
李闻骁

3
00:00:00,660 --> 00:00:00,840

4
00:00:00,840 --> 00:00:00,960

5
00:00:00,970 --> 00:00:01,250
起来

6
00:00:01,700 --> 00:00:01,850

7
00:00:01,850 --> 00:00:02,240
懒洋洋

8
00:00:02,240 --> 00:00:02,440

9
00:00:02,530 --> 00:00:02,670

10
00:00:02,680 --> 00:00:02,920
内衣

11
00:00:02,940 --> 00:00:03,050

12
00:00:03,050 --> 00:00:03,150

13
00:00:03,150 --> 00:00:03,340

14
00:00:03,790 --> 00:00:04,000

15
00:00:04,000 --> 00:00:04,070

16
00:00:04,080 --> 00:00:04,280
大门

17
00:00:04,300 --> 00:00:04,470
密码

18
00:00:04,480 --> 00:00:04,630

19
00:00:04,630 --> 00:00:04,850

20
00:00:05,300 --> 00:00:05,520
以后

21
00:00:05,530 --> 00:00:05,630

22
00:00:05,630 --> 00:00:05,780
事儿

23
00:00:05,780 --> 00:00:05,890

24
00:00:05,900 --> 00:00:06,030

25
00:00:06,030 --> 00:00:06,220

26
00:00:06,670 --> 00:00:06,810

27
00:00:06,810 --> 00:00:06,920

28
00:00:06,920 --> 00:00:07,100

29
00:00:07,410 --> 00:00:07,820
下意识

30
00:00:07,820 --> 00:00:08,020

31
00:00:08,470 --> 00:00:08,930
为什么

32
00:00:09,390 --> 00:00:09,530

33
00:00:09,540 --> 00:00:09,830
勾唇

34
00:00:09,830 --> 00:00:10,220
笑道

35
00:00:10,670 --> 00:00:10,880

36
00:00:10,880 --> 00:00:11,060
昨天

37
00:00:11,070 --> 00:00:11,430
答应

38
00:00:11,510 --> 00:00:11,620

39
00:00:11,620 --> 00:00:11,680

40
00:00:11,690 --> 00:00:12,010
女朋友

41
00:00:12,010 --> 00:00:12,170

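I don't think it splits by sentence (it is probably by phrase), so you may need to do some post-processing on the text and subtitles yourself. Hopefully a future update from the author will make the SRT break by sentence.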

@LiMinghuaLiGan

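First, the author does not speak Chinese. Second, edge-tts does text-to-speech, which has nothing to do with translation. Are you sure you are in the right place?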

@XerxesLee

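Each WordBoundary that edge-tts produces is a single character or a single word, with no punctuation. You have to write your own code that compares them against the original text to assemble the subtitle for each sentence.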
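The assembly step described above could look roughly like the following. This is a minimal sketch, not edge-tts code: `merge_words_into_lines` and the `Cue` tuple are hypothetical names, the word timings are taken from the SRT posted earlier in this thread, and the single-character entries are filled in by assumption from the source sentence. It relies on the heuristic that the word texts, once joined, equal the source line with punctuation and spaces removed.

```python
import re
from typing import List, Tuple

# Hypothetical cue shape: (start_seconds, end_seconds, text)
Cue = Tuple[float, float, str]

# Matches punctuation, whitespace, and underscores; CJK characters are kept.
_STRIP = re.compile(r"[\W_]+")


def merge_words_into_lines(words: List[Cue], lines: List[str]) -> List[Cue]:
    """Group word-level cues so that each output cue covers one source line."""
    merged: List[Cue] = []
    i = 0
    for line in lines:
        target = _STRIP.sub("", line)
        if not target:
            continue  # skip blank or punctuation-only lines
        first = i
        consumed = ""
        # Consume word cues until their joined text is as long as the line.
        while i < len(words) and len(consumed) < len(target):
            consumed += _STRIP.sub("", words[i][2])
            i += 1
        if i == first:
            break  # no word cues left for the remaining lines
        merged.append((words[first][0], words[i - 1][1], line.strip()))
    return merged


words = [
    (0.09, 0.29, "从"), (0.31, 0.66, "李闻骁"), (0.66, 0.84, "床"),
    (0.84, 0.96, "上"), (0.97, 1.25, "起来"), (1.70, 1.85, "他"),
    (1.85, 2.24, "懒洋洋"), (2.24, 2.44, "地"), (2.53, 2.67, "把"),
    (2.68, 2.92, "内衣"), (2.94, 3.05, "递"), (3.05, 3.15, "给"),
    (3.15, 3.34, "我"),
]
lines = ["从李闻骁床上起来", "他懒洋洋地把内衣递给我"]
for cue in merge_words_into_lines(words, lines):
    print(cue)
```

Matching by length keeps the sketch short. If the word texts can differ from the source (for example, numbers that are read out as words), a stricter comparison would be needed.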

@LiMinghuaLiGan

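Yes, but that does not affect how the audio is stitched together. The text-to-speech function works fine: what I get is speech with normal pauses, the same as what Microsoft Edge's Read Aloud produces.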

@LiMinghuaLiGan

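Don't you need an existing subtitle file before you can do text-to-speech? That is how I do it, so why stitch everything together a second time?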

@chnyangjie

#346 can solve this problem

@ongcl03

ongcl03 commented Jan 10, 2025

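#346 can solve this problem

This works. It splits based on the Chinese full stop "。". The original poster may need to do some punctuation processing to generate the corresponding output.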
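For illustration, splitting on the Chinese full stop could be sketched as below. This is not the code from #346: `split_sentences` is a hypothetical helper, and treating line breaks as delimiters alongside "。" is an assumption made here so that the unpunctuated sample text also splits.

```python
import re
from typing import List

# Delimiters: the Chinese full stop and line breaks. Adding others such as
# "！" or "？" would be a further assumption not stated in this thread.
_DELIMS = re.compile(r"[。\n]+")


def split_sentences(text: str) -> List[str]:
    """Split text into subtitle-sized sentences, dropping empty pieces."""
    pieces = (piece.strip() for piece in _DELIMS.split(text))
    return [piece for piece in pieces if piece]


sample = "从李闻骁床上起来\n他懒洋洋地把内衣递给我。他勾唇笑道。"
print(split_sentences(sample))
```

Quotation marks such as 「 」 are left in place. Removing them would be a separate punctuation-processing step.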

@chnyangjie

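#346 can solve this problem

This works. It splits based on the Chinese full stop "。". The original poster may need to do some punctuation processing to generate the corresponding output.

Both line breaks and full stops work, and the result is acceptable: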

1
00:00:00,100 --> 00:00:02,262
从李闻骁床上起来

2
00:00:02,212 --> 00:00:05,112
他懒洋洋地把内衣递给我

3
00:00:05,112 --> 00:00:07,237
我把大门密码改了

4
00:00:07,237 --> 00:00:09,237
以后没事儿别来了

5
00:00:09,237 --> 00:00:11,287
我一愣,下意识问

6
00:00:11,287 --> 00:00:12,475
为什么

7
00:00:12,475 --> 00:00:14,312
他勾唇笑道

8
00:00:14,312 --> 00:00:17,312
她昨天答应当我女朋友了
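Once sentence-level timings are available, writing them out in the SRT layout shown above is mechanical. The sketch below is an assumption about how one might do it by hand, not how edge-tts or #346 does it. `srt_timestamp` and `to_srt` are hypothetical helper names.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as numbered SRT blocks."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


print(to_srt([(0.1, 2.262, "从李闻骁床上起来"), (2.212, 5.112, "他懒洋洋地把内衣递给我")]))
```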

@xhzkp

xhzkp commented Jan 11, 2025

@chnyangjie
How did you solve it? I could not follow. Could you walk me through it? Thanks.
