Why pipes sometimes get "stuck": buffering #958

Open
1 task
ShellLM opened this issue Dec 11, 2024 · 1 comment
Labels
  • CLI-UX: Command Line Interface user experience and best practices
  • linux: Linux notes tools links
  • shell-tools: Tools and utilities for shell scripting and command line operations
  • TIL: Short notes or tips on coding, linux, llms, ml, etc

ShellLM commented Dec 11, 2024

Why pipes sometimes get "stuck": buffering

Snippet

Here's a niche terminal problem that has bothered me for years but that I never really understood until a few weeks ago. Let's say you're running this command to watch for some specific output in a log file:

tail -f /some/log/file | grep thing1 | grep thing2

If log lines are being added to the file relatively slowly, the result I'd see is… nothing! It doesn't matter if there were matches in the log file or not, there just wouldn't be any output.

Why this happens: buffering

The reason why "pipes get stuck" sometimes is that it's VERY common for programs to buffer their output before writing it to a pipe or file. This is for performance reasons: writing all output immediately as soon as you can uses more system calls, so it's more efficient to save up data until you have 8KB or so of data to write (or until the program exits) and THEN write it to the pipe.

In this example:

tail -f /some/log/file | grep thing1 | grep thing2

the problem is that grep thing1 is saving up all of its matches until it has 8KB of data to write, which might literally never happen.
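
To make the system call argument concrete, here is a minimal Python sketch (not from the original post) that counts how many writes reach a stand-in for the pipe with and without an 8KB buffer. CountingSink and count_writes are hypothetical names invented for this illustration, and an in-memory sink only approximates a real pipe.

```python
import io


class CountingSink(io.RawIOBase):
    """Hypothetical stand-in for a pipe that counts the write() calls reaching it."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.data += bytes(b)
        return len(b)


def count_writes(lines, buffer_size):
    """Write every line through a buffer of the given size; return sink write calls."""
    sink = CountingSink()
    if buffer_size == 0:
        out = sink  # unbuffered: every line goes straight to the sink
    else:
        out = io.BufferedWriter(sink, buffer_size=buffer_size)
    for line in lines:
        out.write(line)
    out.flush()  # stands in for the flush that happens when a program exits
    return sink.calls


lines = [b"match %d\n" % i for i in range(1000)]
unbuffered_calls = count_writes(lines, 0)     # one write per line
buffered_calls = count_writes(lines, 8192)    # a handful of large writes
print(unbuffered_calls, buffered_calls)
```

The flip side of the saving is the "stuck" pipe: if the matches add up to less than the buffer size, nothing reaches the sink until the final flush.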

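Programs that use libc's standard I/O typically line-buffer when writing to a terminal (each line is written as soon as it is complete), but block-buffer when writing to a pipe or file. grep (and many other programs) checks whether its output is a terminal and picks its buffering mode accordingly, which is why the same command behaves differently at the end of a pipeline than in the middle of one.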
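
The terminal check itself is small. The sketch below is an illustration of the usual rule, not grep's actual code: buffering_mode is a hypothetical helper, and the pipe is created locally just to have a non-terminal file descriptor to ask about.

```python
import os


def buffering_mode(fd):
    """Mimic the common libc rule: line-buffer terminals, block-buffer everything else."""
    return "line" if os.isatty(fd) else "block"


# Create a throwaway pipe and ask how output to its write end would be buffered.
read_end, write_end = os.pipe()
try:
    mode_for_pipe = buffering_mode(write_end)
finally:
    os.close(read_end)
    os.close(write_end)

print(mode_for_pipe)  # a pipe is not a terminal, so this prints "block"
```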

Commands that buffer & commands that don't

Some commands that don't buffer their output:

  • tail
  • cat
  • tee

But most other commands will buffer their output when writing to a pipe, including:

  • grep (--line-buffered)
  • sed (-u)
  • awk (there's a fflush() function)
  • tcpdump (-l)
  • jq (--unbuffered)
  • tr (-u on BSD/macOS; GNU tr has no such flag)
  • cut (has no flag to disable buffering)

Programming languages where the default "print" statement buffers

The default print statement will buffer output when writing to a pipe in languages like:

  • C (disable with setvbuf)
  • Python (disable with python -u, PYTHONUNBUFFERED=1, sys.stdout.reconfigure(line_buffering=True), or print(x, flush=True))
  • Ruby (STDOUT.sync = true)
  • Perl ($| = 1)
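
For the Python entry, here is a small sketch of what two of those options change. It assumes CPython 3.7+ and uses an in-memory stream layered like sys.stdout instead of a real pipe; make_stdout_like is a hypothetical helper written for this note.

```python
import io


def make_stdout_like(line_buffering):
    """Build a text stream layered like sys.stdout over an in-memory 'pipe'."""
    raw = io.BytesIO()  # stand-in for the pipe
    buffered = io.BufferedWriter(raw, buffer_size=8192)
    text = io.TextIOWrapper(
        buffered, encoding="utf-8", newline="\n", line_buffering=line_buffering
    )
    return raw, text


# Default behaviour when stdout is a pipe: the line sits in the buffer.
raw, out = make_stdout_like(line_buffering=False)
print("match 1", file=out)
stuck = raw.getvalue()

# print(..., flush=True) pushes everything buffered so far through.
print("match 2", file=out, flush=True)
after_flush = raw.getvalue()

# reconfigure(line_buffering=True) makes every newline trigger a flush.
raw2, out2 = make_stdout_like(line_buffering=False)
out2.reconfigure(line_buffering=True)
print("match 3", file=out2)
after_reconfigure = raw2.getvalue()

print(stuck, after_flush, after_reconfigure)
```

In a real script the same calls would be made on sys.stdout itself.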

Solutions to avoid buffering

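  1. Run a program that finishes quickly, since buffers are flushed when a program exits, e.g. cat /some/log/file | grep thing1 | grep thing2 | tail
  2. Use the "line buffer" flag for grep: tail -f /some/log/file | grep --line-buffered thing1 | grep thing2 (only the first grep needs it; the last one writes to the terminal)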
  3. Use awk instead of multiple greps: tail -f /some/log/file | awk '/thing1/ && /thing2/'
  4. Use stdbuf to disable libc buffering: tail -f /some/log/file | stdbuf -o0 grep thing1 | grep thing2
  5. Use unbuffer to force the program's output to be a TTY: tail -f /some/log/file | unbuffer grep thing1 | grep thing2
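
If none of these fit, another option in the same spirit as the awk one-liner is a tiny filter of your own that flushes after every match. This is a rough sketch, not something from the original post: follow_matches is a hypothetical name, and it does plain substring matching, not regular expressions.

```python
import io
import sys


def follow_matches(lines, patterns, out=sys.stdout):
    """Copy lines that contain every pattern to out, flushing after each match."""
    count = 0
    for line in lines:
        if all(p in line for p in patterns):
            out.write(line)
            out.flush()  # do not wait for the buffer to fill up
            count += 1
    return count


# Small in-memory demonstration; real input would come from sys.stdin.
demo_out = io.StringIO()
demo_count = follow_matches(
    ["thing1 and thing2 here\n", "only thing1\n", "neither\n"],
    ["thing1", "thing2"],
    demo_out,
)
print(demo_count, repr(demo_out.getvalue()))
```

Saved to a file (say a hypothetical follow.py) with a last line calling follow_matches(sys.stdin, sys.argv[1:]), it could sit after tail -f in place of the two greps.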


Suggested Labels

  • terminal
  • buffering
  • pipes
  • performance
  • programming


ShellLM commented Dec 11, 2024

Related content

#746 similarity score: 0.88
#741 similarity score: 0.87
#932 similarity score: 0.85
#730 similarity score: 0.84
#924 similarity score: 0.84
#24 similarity score: 0.83
