Carriage returns disappear from translations #34

Open
peurKe opened this issue Nov 25, 2024 · 1 comment


peurKe commented Nov 25, 2024

Hi,
I'm having a problem managing \r and \n CARRIAGE RETURN characters.
I want to translate sentences containing CARRIAGE RETURN characters that can be located right in the middle of sentences.

I use the following parameters to ask DeepL not to split sentences on CARRIAGE RETURN characters:

import deepl
translator = deepl.Translator('my_auth_key')

text = "Отпустите рюкзак\r\nи он автоматически\r\nвернется на место"
translation = translator.translate_text(
    text,
    source_lang=deepl.Language.RUSSIAN,
    target_lang=deepl.Language.ENGLISH_AMERICAN,
    formality=deepl.Formality.PREFER_LESS,
    split_sentences=deepl.SplitSentences.NO_NEWLINES,
    # preserve_formatting=True  # Doesn't help, even when uncommented
).text

The meaning of the translations returned by the API through the DeepL Python library is correct, but I lose the CARRIAGE RETURN characters in the results. I have also tried with preserve_formatting=True, with no improvement.
I tried with \n instead of \r\n and got the same results.

text          : 'Отпустите рюкзак\r\nи он автоматически\r\nвернется на место'
translation   : 'Let go of the backpack and it will automatically return to its place'

text          : 'Отпустите рюкзак\nи он автоматически\nвернется на место'
translation   : 'Let go of the backpack and it will automatically return to its place'

Here's what I expect to happen:

text          : 'Отпустите рюкзак\r\nи он автоматически\r\nвернется на место'
translation   : 'Let go of the backpack\r\nand it will automatically\r\nreturn to its place'

text          : 'Отпустите рюкзак\nи он автоматически\nвернется на место'
translation   : 'Let go of the backpack\nand it will automatically\nreturn to its place'

Am I doing something wrong?
Thanks in advance


peurKe commented Nov 25, 2024

The workaround I've found is to mark each line break with an XML placeholder, <w><x>LF</x></w>, and have DeepL leave it untouched via the ignore_tags parameter. I can then replace the placeholders with CARRIAGE RETURN characters:

translation = translator.translate_text(
    text,
    source_lang=deepl.Language.RUSSIAN,
    target_lang=deepl.Language.ENGLISH_AMERICAN,
    formality=deepl.Formality.PREFER_LESS,
    split_sentences=deepl.SplitSentences.NO_NEWLINES,
    tag_handling='xml',  # Enable xml tags
    ignore_tags=['x'],  # Ignore <x> xml tag
).text

text          : 'Отпустите рюкзак<w><x>LF</x></w>и он автоматически<w><x>LF</x></w>вернется на место'
translation   : 'Let go of the backpack<w><x>LF</x></w>and it will automatically<w><x>LF</x></w>back in place'

Then replace all these <w><x>LF</x></w> placeholders with \n to finally get the result I was expecting:

translation   : 'Let go of the backpack\nand it will automatically\nback in place'
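
For anyone who wants to reuse this, here is a minimal sketch of the whole round trip. The helper names (protect_newlines, restore_newlines, translate_keeping_newlines) are hypothetical and not part of the deepl library; only translate_text with tag_handling and ignore_tags comes from the calls shown above. It assumes the input contains no other XML-special characters such as & or <, which would probably need escaping first, and that DeepL returns the placeholder unchanged as in the output above.

```python
import re

# Placeholder from the workaround above; <x> must be listed in ignore_tags.
PLACEHOLDER = "<w><x>LF</x></w>"


def protect_newlines(text: str) -> str:
    # Replace every CRLF, lone CR, or lone LF with the XML placeholder.
    return re.sub(r"\r\n|\r|\n", PLACEHOLDER, text)


def restore_newlines(text: str, newline: str = "\n") -> str:
    # Turn each placeholder back into the requested line-break sequence.
    return text.replace(PLACEHOLDER, newline)


def translate_keeping_newlines(translator, text: str, **kwargs) -> str:
    # Reuse the line-break style of the input (assumption: it is consistent).
    newline = "\r\n" if "\r\n" in text else "\n"
    result = translator.translate_text(
        protect_newlines(text),
        tag_handling="xml",   # enable XML tag handling
        ignore_tags=["x"],    # content of <x> is not translated
        **kwargs,             # source_lang, target_lang, formality, ...
    )
    return restore_newlines(result.text, newline)
```

It would be called as translate_keeping_newlines(translator, text, source_lang=..., target_lang=...) with the same translator object and keyword arguments as before.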

But I'd like to avoid using xml tags and let DeepL take care of keeping the \r and \n CARRIAGE RETURN characters.
Is this possible, or am I doing something wrong?
