Pyinfra philosophy (and shell quoting/unquoting) #732

bogen85 · 2022-01-01T17:44:11Z

bogen85
Jan 1, 2022

From what I've seen of Pyinfra so far, although it is immature compared to established configuration tools (Ansible, CFEngine, Chef, Puppet, etc.), I find it very refreshing.

Since I'm most familiar with Ansible and use it on POSIX systems I'll primarily use them as a reference here.

Ansible is pushed as being agent-less, as is Pyinfra. But that is not entirely correct.

In common usage, both rely on an SSH server. This is typically safe to assume to be present, as it is one of the most common remote access tools for modern POSIX systems, so that requirement (in my opinion) should always be acceptable.

Ansible relies on python which on most modern POSIX systems is "assumed" to already be installed. However, in the case of alpine and other busybox based small installs, python might be considered "bloatware". So across the board on all POSIX hosts one wants to maintain centrally, requiring python to be on those hosts is something I don't see as preferable.

Pyinfra relies just on /bin/sh. This is a safe requirement, as it will be on any system that is considered POSIX, so it is moot to argue whether it should be required.

Sole reliance on /bin/sh is at the core of what I want to discuss, but it may be unavoidable in order to keep Pyinfra lean and simple.

The POSIX shell is a whitespace-delimited token language that does not require strings to be quoted (unless they contain whitespace or other characters the shell treats specially). Because file names may contain spaces or other special characters, there is a need for quoting, and this can get quite complex, especially when the outer quotes are stripped each time a command is passed from one shell to another, or over ssh.
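To make the "outer quotes may be stripped" point concrete, here is a minimal sketch using only Python's standard `shlex` module. The helper `wrap_in_shell` and the file name are made up for illustration, and `shlex.split` is used as an approximation of how a POSIX shell tokenizes a line, not as the shell itself.

```python
import shlex


def wrap_in_shell(command: str) -> str:
    """Hypothetical helper: wrap a command line so one more `sh -c` layer can carry it."""
    return "sh -c " + shlex.quote(command)


filename = "my file's name.txt"  # contains spaces and a single quote

# Innermost command: needs one level of quoting for the file name.
inner = "cat " + shlex.quote(filename)

# Each extra shell (local sh, ssh, remote sh, ...) needs one more level.
one_layer = wrap_in_shell(inner)
two_layers = wrap_in_shell(one_layer)

# Each parsing pass strips exactly one level of quoting.
peeled_once = shlex.split(two_layers)       # ['sh', '-c', one_layer]
peeled_twice = shlex.split(peeled_once[2])  # ['sh', '-c', inner]
final_argv = shlex.split(peeled_twice[2])   # ['cat', filename]

print(two_layers)
print(final_argv)
```

The doubly wrapped string is already hard to read by eye, which is the complexity the post describes: the number of quoting levels has to match the number of shells the command passes through.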

So quoting is a problem and a necessary evil when relying solely on /bin/sh.

Ansible gets around this because it relies on python. Python scripts are generated by ansible and sent to the remote hosts for execution. Python does not have the same quoting problem as the POSIX shell has.
But as I already mentioned, python is a heavy and not always acceptable requirement.

After installing python3, my usage on a minimal alpine install went from 25MiB to 74MiB.

There may be a way to simplify and harden the quoting issue.

So, all this to say, sticking with "sh -c" as the base requirement is likely the simplest way to go, and moving to some agent or pseudo agent will likely just add unnecessary complexity.

What we need is a reliable generalized quote/unquote mechanism that does not leak, as I saw with issue #731, where it resulted in erroneous facts. (UPDATE: And there is, and it seems like the python shlex module is used)
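Since the update mentions Python's `shlex` module, here is a small sketch of the quote/unquote round trip it offers. The argument list is invented for illustration, and this says nothing about how pyinfra itself calls `shlex`; it only shows that `shlex.join` (Python 3.8+) followed by `shlex.split` returns the original arguments unchanged.

```python
import shlex

# Made-up arguments that would all break an unquoted command line.
tricky_args = [
    "plain",
    "two words",
    "it's",
    'say "hi"',
    "$HOME; rm -rf /tmp/x",
    "back\\slash",
    "",
]

# Quote: build a single command line string, quoting each argument as needed.
command_line = shlex.join(["ls", "-l", *tricky_args])

# Unquote: parse the string back into an argument list using POSIX rules.
round_tripped = shlex.split(command_line)

print(command_line)
print(round_tripped == ["ls", "-l", *tricky_args])
```

A mechanism "leaks" when this round trip fails somewhere along the way, for example when one argument is interpolated into a command string without being quoted first.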

EDIT: Removed discussion of language alternatives to python on the remote hosts.

bogen85 · 2022-01-02T15:21:02Z

bogen85
Jan 2, 2022
Author

I guess my frustration is this... Linux has been around for 30+ years now. Why are we still dealing with quoting issues when issuing commands on a Linux system?

run_cmd([path, arg1, arg2, ...]);

(where the args can be filenames, with any number of spaces and quotes and other allowed characters in them...)

This should not need convoluted in-between steps, even over SSH...

Not pointing at Pyinfra for being at fault here, the situation is what it is (as Pyinfra has to work with what is available)...
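For comparison, the `run_cmd([path, arg1, arg2, ...])` style does exist for local execution. The sketch below uses Python's `subprocess.run` with an argument list, so no shell parses the arguments; the `run_cmd` body and the awkward file name are illustrative assumptions. It does not carry over to SSH as-is, because ssh joins its command arguments with spaces and hands the result to the remote user's shell.

```python
import subprocess
import sys
from typing import Sequence


def run_cmd(argv: Sequence[str]) -> str:
    """Run a command from an argument vector, with no shell in between."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


# An argument with spaces and both kinds of quotes, passed without any quoting.
awkward = "a file with 'quotes' and \"more quotes\".txt"

# The child is a tiny Python program that prints its first argument back.
echoed = run_cmd([sys.executable, "-c", "import sys; print(sys.argv[1])", awkward])

print(echoed.rstrip("\n") == awkward)
```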

2 replies

bogen85 Jan 2, 2022
Author

This is basically the issue I'm referring to ...

bogen85 Jan 3, 2022
Author

So... if a connector based on the current ssh connector were written that required something like ARGHSH to be installed on the remote host(s) (or had some other parameter enforcing it), could all intermediate quoting of command paths and arguments in pyinfra be eliminated (for connections to those hosts)?

Fizzadar · 2022-04-09T17:12:28Z

Fizzadar
Apr 9, 2022
Maintainer

Apologies for taking so long to reply to this! I totally agree with the problems raised here, quoting shell commands is a nightmare and very error prone. The StringCommand has improved this situation, but it's still quoting :)

Perhaps there's some way to bring ARGHSH, or something like it, into pyinfra. Although the issue will then become how to handle the same thing in other connectors like @docker that use local shell commands.


gchazot · 2023-03-14T19:19:47Z

gchazot
Mar 14, 2023

My 2 cents on the topic, I think that from the point of view of the user, StringCommand solves this problem (and should solve it completely).

By completely I mean that users (writing deploys, facts, operations) should only ever provide python strings corresponding to what they want, and shouldn't have to quote for the transport of the command to the target by pyinfra (1 level of quoting only. If the remote command would need to be quoted in the SSH session, then it's not pyinfra's problem). For example if I want pyinfra to echo the string "\n" (2 characters, \ and n, not a newline) to a file on the remote machine, then I'd pass it something like "echo \\n > myfile" as the command, since that's how I'd type it in my shell.

However, I think this needs to come as well with a refactor of all of the (built-in) Facts & Operations commands that are today created using format strings, especially when the format arguments are user-provided and hence may contain characters that need escaping. These should be migrated to StringCommand. That refactor in itself may be a big chunk of work to do. I haven't done any inventory of what needs migrating, but I've come across a few I now realise will end up causing issues.
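As an illustration of why format-string commands with user-provided arguments are risky, here is a sketch that compares a naive format string with one that quotes the argument first. It uses only the standard `shlex` module; `naive_command`, `quoted_command` and the path are hypothetical names for this example and are not pyinfra's StringCommand API.

```python
import shlex


def naive_command(path: str) -> str:
    """Hypothetical fact/operation command built with a bare format string."""
    return "ls -l {0}".format(path)


def quoted_command(path: str) -> str:
    """Same command, but the user-provided value is quoted before interpolation."""
    return "ls -l {0}".format(shlex.quote(path))


# A user-provided path containing a space and a shell metacharacter.
user_path = "/tmp/my dir; echo oops"

# shlex.split approximates how a POSIX shell would tokenize each line.
naive_tokens = shlex.split(naive_command(user_path))
quoted_tokens = shlex.split(quoted_command(user_path))

print(naive_command(user_path))   # the path falls apart, and `;` would end the command
print(quoted_command(user_path))  # the path stays a single argument
print(quoted_tokens == ["ls", "-l", user_path])
```

In a real shell the naive version would also run `echo oops` as a second command, which `shlex.split` does not model; the token comparison only shows that the path no longer arrives as one argument.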

Another thing to do would be to talk about this problem in the documentation, especially mentioning that commands are run in /bin/sh, not bash, zsh or ksh that users might be more familiar with.

Then, from the point of view of pyinfra, well... argh! :-) ... as in, maybe ARGHSH or something similar (effectively an agent / a target prerequisite) might be necessary if quoting/unquoting can't solve all situations, or if the level of complexity becomes unacceptable.


sfermigier · 2023-11-18T10:45:42Z

sfermigier
Nov 18, 2023

"Installing python3 my usage on a minimal alpine install went from 25MiB to 74MiB."

I wonder if Micropython couldn't be a solution.
https://pkgs.alpinelinux.org/package/edge/testing/x86/micropython
Size: 245.12 kB
Installed size: 476 kB

