Support ISO-8859-1 encoding for testcase json files #100

Eorlariel · 2024-10-17T08:40:18Z

Currently, umlauts such as "ä" and "ö" cause lobster-json to fail if the JSON file is saved with ISO-8859-1 encoding, because lobster-json tries to read it as UTF-8. lobster-json should not fail in these cases; it should also support other encodings.

This may be one way to detect the encoding:
https://www.powershellgallery.com/packages/poshfunctions/2.2.1.1/content/functions/get-fileencoding.ps1

Acceptance Criteria:
lobster doesn't fail when umlauts are used in testcase JSON files saved in common encodings such as:

UTF8
ISO-8859-1
UTF16 LE
UTF16 BE
Windows 1252
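As a rough illustration of how the encodings listed above could be handled without an extra dependency, here is a minimal sketch using only the Python standard library. The function name `decode_json_bytes` is hypothetical and not part of lobster. It assumes the file is JSON (so the first character is ASCII), and it cannot tell ISO-8859-1 and Windows 1252 apart, because both are single-byte encodings that agree on "ä", "ö" and "ü". UTF-32 is not handled.

```python
import codecs
import json


def decode_json_bytes(raw: bytes) -> str:
    """Hypothetical helper: decode JSON bytes in one of the listed encodings."""
    # A byte order mark (BOM) identifies UTF-8 and UTF-16 unambiguously.
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig")
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")  # the codec consumes the BOM itself

    # Without a BOM: JSON starts with an ASCII character, so in UTF-16
    # exactly one of the first two bytes is a NUL byte.
    if len(raw) >= 2:
        if raw[0] == 0 and raw[1] != 0:
            return raw.decode("utf-16-be")
        if raw[0] != 0 and raw[1] == 0:
            return raw.decode("utf-16-le")

    # UTF-8 is strict, so ISO-8859-1 umlauts normally fail here and
    # fall through to the single-byte encodings.
    for encoding in ("utf-8", "cp1252"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

    # latin-1 (ISO-8859-1) maps every byte to a character and never fails.
    return raw.decode("latin-1")


sample = '{"name": "Prüfung äö"}'
for enc in ("utf-8", "utf-16", "utf-16-le", "utf-16-be", "cp1252", "latin-1"):
    data = json.loads(decode_json_bytes(sample.encode(enc)))
    print(enc, data["name"])
```

The fallback order matters: UTF-8 has to be tried before the single-byte encodings, since those accept almost any byte sequence and would otherwise turn valid UTF-8 into mojibake.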


phiwuu · 2024-10-17T12:41:49Z

One possibility is to use this code snippet to guess the encoding with a certain confidence:

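# chardet is a third-party package (pip install chardet)
import chardet

# Read the raw bytes; decoding can only happen once the encoding is known
with open('example.txt', 'rb') as file:
    result = chardet.detect(file.read())
    encoding = result['encoding']      # may be None if nothing matched
    confidence = result['confidence']  # float between 0.0 and 1.0

print(f"The file is encoded in '{encoding}' with confidence {confidence * 100:.2f}%.")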

If the confidence is above a threshold, we could accept the detected encoding. We could add a command-line flag such as --detect-encoding=80 to request encoding detection and to require a confidence level of at least 80%.
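To make the threshold idea concrete, here is a minimal sketch of how such a flag and check might fit together. It is not lobster-json's actual command-line parser. The names `decode_with_detection`, `percentage` and `build_parser` are hypothetical, and the `--detect-encoding` flag is only the proposal from this comment. The detector is passed in as a parameter so the logic can be tried without chardet installed; by default it falls back to `chardet.detect`.

```python
import argparse


def decode_with_detection(raw, min_confidence, detect=None):
    """Hypothetical helper: decode raw bytes if the detector is confident enough.

    min_confidence is a float between 0.0 and 1.0.
    """
    if detect is None:
        import chardet  # third-party; imported lazily so it stays optional
        detect = chardet.detect
    result = detect(raw)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    if encoding is None or confidence < min_confidence:
        raise ValueError(
            f"encoding guess {encoding!r} has confidence {confidence:.2f}, "
            f"below the required {min_confidence:.2f}"
        )
    return raw.decode(encoding)


def percentage(value):
    """argparse type: an integer between 0 and 100."""
    number = int(value)
    if not 0 <= number <= 100:
        raise argparse.ArgumentTypeError("expected a value between 0 and 100")
    return number


def build_parser():
    # Illustrative parser only; the flag name follows the proposal above.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--detect-encoding",
        type=percentage,
        default=None,
        metavar="PERCENT",
        help="detect the file encoding and require at least this confidence",
    )
    return parser


args = build_parser().parse_args(["--detect-encoding=80"])
threshold = args.detect_encoding / 100  # 80 becomes 0.8
```

Leaving the flag unset (`None`) would keep the current UTF-8 behaviour, so detection stays opt-in.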

phiwuu added the lobster-json Affects JSON integration label Oct 17, 2024

phiwuu mentioned this issue Oct 17, 2024

Catch UnicodeDecodeError in lobster-json #102

Closed
