Add stdin_bytes to skip encoding #915

Dreamsorcerer · 2023-02-12T15:40:10Z

Fixes #818.

Dreamsorcerer · 2023-02-28T18:26:19Z

@bitprophet Not sure what you pushed, but it seems to have added a bunch of unrelated changes to the diff.

Dreamsorcerer · 2023-02-28T18:26:59Z

I should still have the original branch, if you just want me to force push it back?

Dreamsorcerer · 2023-04-08T20:10:59Z

@bitprophet ?

bitprophet · 2023-05-12T18:01:48Z

Yeah, looks like the typing updates made GitHub lose its mind. If you can rebase and force-push, that'd probably be ideal. You'll probably want to make sure you copy over any type hints from main into your branch's diff, if there even are any.

From skimming what look like the two commits that are actually relevant, my off-the-cuff thoughts:

* if we go this route we may want to rename the option to stdin_is_bytes=True or maybe decode_stdin=False
* however I'm wondering if there's anything cleaner/smarter we can do within the "number of bytes to read" helper instead (per your original 'fix' for this elsewhere)
* or, since the "it's binary, please do not do any edge encoding/decoding" approach is arguably more correct, is there anything smarter we can do to autodetect?
  * are there approaches other tools use when handling unknown stdin?
  * eg "attempt to decode some reasonable first few bytes as if it was $configured_encoding, and if this fails, fallback to assuming binary"?
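For illustration only, the "try to decode the first few bytes, else assume binary" idea could be sketched roughly as below. The function name `looks_like_text` and its parameters are hypothetical and not part of this PR or of the library; it only shows the shape of the heuristic using the standard `codecs` module.

```python
import codecs


def looks_like_text(sample: bytes, encoding: str = "utf-8") -> bool:
    """Return True if `sample` decodes cleanly as `encoding` (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    try:
        # final=False tolerates a multi-byte character that was cut off
        # at the end of the sample, which is likely when peeking at a prefix.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        # The prefix is not valid in the configured encoding: assume binary.
        return False
    return True
```

Note that this is only a guess: with a single-byte encoding such as latin-1 every byte sequence decodes successfully, so binary data would be misclassified as text.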

Dreamsorcerer · 2023-05-12T18:09:08Z

Ah, it looks like I made the changes through GitHub, not locally. I'll try cherry-picking to a new branch.

Dreamsorcerer · 2023-05-12T18:35:17Z

* however I'm wondering if there's anything cleaner/smarter we can do within the "number of bytes to read" helper instead (per your original 'fix' for this elsewhere)

That's a different issue, so not sure that function has anything to do with this one.

* or, since the "it's binary, please do not do any edge encoding/decoding" approach is arguably more correct, is there anything smarter we can do to autodetect?
  
  * are there approaches other tools use when handling unknown stdin?
  * eg "attempt to decode some reasonable first few bytes as if it was $configured_encoding, and if this fails, fallback to assuming binary"?

To be honest, I never fully understood the reason for decoding to a str and back again. So, you'll probably just have to make a decision on this one.

The only related thing I can think of is charset-normalizer, which can guess the encoding used. I'm still not sure I'd use it to try and detect if something is binary or not though. It's always going to be an estimate, so you'll likely still have edge cases that cause problems on occasion.

Dreamsorcerer · 2023-05-12T18:38:30Z

Let me know if you want to continue this way, and I'll sort out the typing. It'll be a little tricky now, as the data being passed around is str | bytes and we're detecting it based on an option, rather than an isinstance() check.
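To make the typing difficulty concrete, here is a minimal sketch, assuming a hypothetical helper `encode_stdin` (not the actual code in runners.py). Because the choice is driven by a boolean option, a type checker cannot narrow `str | bytes` on its own, so the narrowing has to be restated explicitly.

```python
from typing import Union


def encode_stdin(data: Union[str, bytes], encoding: str, stdin_bytes: bool) -> bytes:
    """Hypothetical helper: return the bytes to write to the subprocess."""
    if stdin_bytes:
        # The option says "raw bytes", but a bool does not narrow the type
        # of `data`, so the check is repeated for the type checker's benefit.
        assert isinstance(data, bytes)
        return data
    # Without the option, the data is expected to be text and is encoded.
    assert isinstance(data, str)
    return data.encode(encoding)
```

With an isinstance() check as the only switch, the asserts would be unnecessary, which is the trade-off being described.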

Dreamsorcerer changed the title ~~All stdin_bytes to skip encoding~~ Add stdin_bytes to skip encoding Feb 12, 2023

bitprophet force-pushed the main branch from accc441 to ce93e04 Compare February 16, 2023 20:03

Dreamsorcerer mentioned this pull request May 5, 2023

SFTP file transfer speed regression since 2.5.0 + paramiko/paramiko#1660

Open

bitprophet mentioned this pull request May 12, 2023

Awful performance due to reading bytes 1 at a time #819

Closed

Dreamsorcerer added 3 commits May 12, 2023 19:11

All stdin_bytes to skip encoding

ffa8a63

Update config.py

c35663c

Update runners.py

0dae3b5

Dreamsorcerer force-pushed the patch-1 branch from efa9904 to 0dae3b5 Compare May 12, 2023 18:13

Dreamsorcerer added 2 commits May 12, 2023 19:15

Update config.py

3b0c7d2

Update runners.py

add241e
