This repository has been archived by the owner on Jul 6, 2023. It is now read-only.

executing a file with utf characters messes up the encoding #213

Open
mbouclas opened this issue Mar 24, 2020 · 5 comments

Comments

@mbouclas

The file is UTF-8 encoded, and when it is executed all non-ASCII characters turn into ???. The same happens when pasting any UTF-8 text into the shell, e.g. CREATE (t:Test {name: "ασ"});. Is there a way around this? The import file is too large to paste into the browser.

@khelkun

khelkun commented Mar 29, 2021

Can someone help with this, please?
On Windows, I run cypher-shell.bat 4.2.3 with an input file that contains a sequence of node CREATE statements. The file is UTF-8 encoded and contains non-ASCII characters in some node properties. However, the property values end up wrongly decoded in the Neo4j database, as with this Name property:

{
   "nid":751632,
   "Name":"Données d''identification",
}

whereas the Cypher in the UTF-8 file was:

CREATE (n {
   nid:751632,
   Name:"Données d\'identification"
})
RETURN n;

The issue does not occur when running this Cypher from the Neo4j Browser.
I'm not 100% sure cypher-shell is responsible for this, but it seems so.
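
For readers unfamiliar with this kind of corruption, the symptoms in this thread are consistent with UTF-8 bytes being read under a legacy single-byte charset. The following is only an illustrative sketch of that mechanism in Python. It assumes the charsets involved are Windows-1252 and ISO-8859-1, which nothing in this thread confirms for cypher-shell itself.

```python
# Illustration only: reproduces the two kinds of corruption described in
# this thread. The charsets chosen here are assumptions, not a diagnosis.

original = "Données"

# UTF-8 stores "é" as the two bytes 0xC3 0xA9.
utf8_bytes = original.encode("utf-8")
assert utf8_bytes == b"Donn\xc3\xa9es"

# A reader that assumes Windows-1252 maps each byte to its own character.
misread = utf8_bytes.decode("cp1252")
assert misread == "DonnÃ©es"

# The damage is reversible as long as no byte was dropped or replaced.
assert misread.encode("cp1252").decode("utf-8") == original

# Greek letters have no representation in ISO-8859-1. An encoder that
# substitutes unmappable characters emits one "?" per character.
greek = "ασ"
replaced = greek.encode("iso-8859-1", errors="replace")
assert replaced == b"??"
```

The first case matches accented Latin text coming out garbled, the second matches Greek text turning into question marks.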

@khelkun

khelkun commented Mar 31, 2021

I found a workaround: using Notepad++ "Encoding > Convert to ANSI" on my Cypher input file. After that, cypher-shell.bat processes the file correctly and I can see the "Données" string in the Neo4j Browser.

I was able to do the same conversion with GnuWin32 iconv:
iconv.exe -f utf-8 -t iso-8859-1//IGNORE G:\cypher.utf8.txt > G:\cypher.iso-8859-1.txt
The resulting cypher.iso-8859-1.txt is read correctly by cypher-shell.bat.
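
Where iconv is not available, the same conversion can be sketched in Python. This is a minimal sketch, not a tested replacement for the command above: the function name convert_utf8_to_latin1 is hypothetical, and errors="ignore" is used to mirror the //IGNORE suffix, so any character that ISO-8859-1 cannot represent (Greek, for example) is silently dropped.

```python
from pathlib import Path


def convert_utf8_to_latin1(src, dst):
    """Re-encode a UTF-8 text file as ISO-8859-1 (hypothetical helper).

    Works on raw bytes so that line endings are left untouched.
    """
    # "utf-8-sig" also accepts files without a byte order mark and strips
    # the mark when one is present.
    text = Path(src).read_bytes().decode("utf-8-sig")
    # Characters outside ISO-8859-1 are discarded, like iconv's //IGNORE.
    Path(dst).write_bytes(text.encode("iso-8859-1", errors="ignore"))
```

A call such as convert_utf8_to_latin1(r"G:\cypher.utf8.txt", r"G:\cypher.iso-8859-1.txt") would correspond to the iconv invocation above. Because unmappable characters are lost, this only helps when the data fits in ISO-8859-1.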

The problem is more likely coming from the running context of cypher-shell.bat, e.g. the Windows command line cmd.exe.

@ooker777

"The problem is more likely coming from the running context of cypher-shell.bat, e.g. the Windows command line cmd.exe."

I'm using PowerShell 7 and Windows Terminal, both of which handle Unicode characters properly. In cypher-shell, however, the characters are still corrupted.

@calltherain

I have the exact same issue and work around it like this:

$ export LC_ALL=C.UTF-8
$ export LANG=C.UTF-8
$ cypher-shell -u *** -d *** -f cypher.file
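
When cypher-shell is launched from a script rather than an interactive shell, the same locale override can be applied to the child process only. The sketch below assumes a POSIX system where the C.UTF-8 locale exists and cypher-shell is on the PATH. The helper names utf8_locale_env and run_cypher_file are hypothetical, and the -u, -d and -f flags are the ones used in the commands above.

```python
import os
import subprocess


def utf8_locale_env(base=None):
    """Return a copy of the environment with LC_ALL and LANG forced to C.UTF-8."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C.UTF-8"
    env["LANG"] = "C.UTF-8"
    return env


def run_cypher_file(path, user, database):
    """Run cypher-shell on a file under the UTF-8 locale (hypothetical wrapper)."""
    return subprocess.run(
        ["cypher-shell", "-u", user, "-d", database, "-f", path],
        env=utf8_locale_env(),
        check=True,  # raise CalledProcessError if cypher-shell exits non-zero
    )
```

Passing the environment explicitly leaves the locale of the calling process unchanged.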

@tistre

tistre commented Mar 1, 2023

I have the same problem. My workaround is to modify my copy of cypher-shell, adding -Dfile.encoding="UTF8":

root@6b3d35510b5d:/var/lib/neo4j# diff -C 7 /var/lib/neo4j/bin/cypher-shell.bak /var/lib/neo4j/bin/cypher-shell
*** /var/lib/neo4j/bin/cypher-shell.bak Wed Mar  1 16:13:37 2023
--- /var/lib/neo4j/bin/cypher-shell     Wed Mar  1 16:13:45 2023
***************
*** 120,128 ****
--- 120,129 ----
  exec "$JAVACMD" $JAVA_OPTS  \
    -classpath "$CLASSPATH" \
    -Dapp.name="cypher-shell" \
    -Dapp.pid="$$" \
    -Dapp.repo="$REPO" \
    -Dapp.home="$BASEDIR" \
    -Dbasedir="$BASEDIR" \
+   -Dfile.encoding="UTF8" \
    org.neo4j.shell.startup.CypherShellBoot \
    "$@"

Is there a better solution? Or can this option be added to the standard distribution of cypher-shell?
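
One possibility that avoids editing the script: the exec line in the diff above expands $JAVA_OPTS, so the flag could perhaps be supplied through that environment variable instead. This is an untested sketch. It assumes the launcher script does not overwrite JAVA_OPTS before that line, and the helper name with_utf8_java_opts is hypothetical.

```python
import os


def with_utf8_java_opts(base=None):
    """Return a copy of the environment whose JAVA_OPTS includes -Dfile.encoding=UTF8."""
    env = dict(os.environ if base is None else base)
    flag = "-Dfile.encoding=UTF8"
    # Keep whatever options are already set and append the flag only once.
    options = env.get("JAVA_OPTS", "").split()
    if flag not in options:
        options.append(flag)
    env["JAVA_OPTS"] = " ".join(options)
    return env
```

The returned mapping can be passed as env= to subprocess.run when starting cypher-shell. Whether the script actually honours it depends on the parts of the launcher not shown in the diff.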
