How were R18 and the OCR model pre-trained? #14

Open
Zhaohuii-Wang opened this issue Oct 14, 2024 · 2 comments

Comments

@Zhaohuii-Wang

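Hello, and thank you very much for your work! I would like to retrain your open-source method on my own dataset. Does the ResNet18 pre-training refer to adapting it to the high-frequency image domain, and what supervision did you use for that training? Is the OCR model likewise pre-trained to adapt to the image domain after VAE encoding?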

@dailenson
Owner

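Hello! For ResNet18 we directly used the pre-trained model provided by VATr. The OCR model was pre-trained on latent codes: we encode the images into the latent space with the VAE and then optimize the OCR model with a CTC loss.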
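For readers who want to reproduce this on their own data, here is a minimal PyTorch sketch of the recipe described in this reply: encode images to latent codes, then train a recognizer on those codes with a CTC loss. It is not the repository's code. `ToyEncoder`, `LatentOCR`, and `train_step` are hypothetical names, the single-convolution encoder is only a stand-in for a real pre-trained VAE encoder, and keeping the encoder frozen, the 8x downsampling, the 4 latent channels, and the 80-class alphabet are all assumptions.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a pre-trained VAE encoder (assumed frozen)."""

    def __init__(self, latent_channels=4):
        super().__init__()
        # Assumption: the latent grid is 8x smaller than the image in H and W.
        self.net = nn.Conv2d(3, latent_channels, kernel_size=8, stride=8)

    def forward(self, images):
        return self.net(images)


class LatentOCR(nn.Module):
    """Hypothetical recognizer that reads latent codes instead of pixels."""

    def __init__(self, latent_channels=4, latent_height=8, hidden=64, num_classes=80):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(latent_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.rnn = nn.LSTM(hidden * latent_height, hidden, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)  # class 0 is the CTC blank

    def forward(self, z):
        f = self.conv(z)  # (N, C, H, W)
        n, c, h, w = f.shape
        # Treat each latent column as one time step: (T=W, N, C*H).
        seq = f.permute(3, 0, 1, 2).reshape(w, n, c * h)
        out, _ = self.rnn(seq)
        return self.head(out).log_softmax(dim=-1)  # (T, N, num_classes)


def train_step(encoder, ocr, optimizer, images, targets, target_lengths):
    """One optimization step of the OCR model on latent codes with CTC loss."""
    ctc = nn.CTCLoss(blank=0, zero_infinity=True)
    with torch.no_grad():  # assumption: only the OCR model is updated
        z = encoder(images)
    log_probs = ocr(z)
    t, n, _ = log_probs.shape
    input_lengths = torch.full((n,), t, dtype=torch.long)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

To try it, build `ToyEncoder()` and `LatentOCR()`, create an optimizer over `ocr.parameters()`, and call `train_step` with a batch of images shaped `(N, 3, 64, 256)`, padded label indices in `1..79`, and the true label lengths. In a real setup the stand-in encoder would be replaced by the VAE actually used for generation, so the recognizer sees the same latent distribution it will later supervise.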

@Zhaohuii-Wang
Author

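Thanks for the reply! If I remember correctly, the R18 pre-trained by VATr used their team's own Font^{2} method (Evaluating Synthetic Pre-Training for Handwriting Processing Tasks), but the released model weights do not seem to include any Chinese data. When you trained the Chinese version of the model, did directly using the weights provided by VATr affect its generalization? Put another way, training R18 on data from the Chinese distribution might give better style extraction and text generation. Do you have any experiments on this?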
