Add an option to show first N lines (--head=N) #359

alexm · 2022-06-14T09:48:29Z

A common use of ack is filtering the output of commands like ps, kubectl, etc. that show column names on the first line:

$ ps -a
    PID TTY          TIME CMD
   8079 tty2     00:02:53 Xorg
   8256 tty2     00:00:00 gnome-session-b
  22672 pts/0    00:00:00 ps

Sometimes it would be useful to show the column names (usually the first line is enough) when the meaning of the output is not obvious. This is easy to achieve by adding a match for the first line, e.g.:

$ ps -a | ack '^\s*PID|gnome'
    PID TTY          TIME CMD
   8256 tty2     00:00:00 gnome-session-b

However, an option to always show the first line would be very useful, e.g.

ps -a | ack --column-names gnome
ps -a | ack --first-line gnome


petdance · 2022-06-14T23:39:34Z

~~This falls outside of what ack is intended to do. ack doesn't know anything about the text it's searching, and parsing that would certainly be the case.~~ Ignore this, I thought you were talking about parsing the heading line.

For something like this, why even use ack vs. grep? There are no advantages to ack over grep, other than the Perl regexes.

If all you really want is the headings from ps, you could call ps twice and use head to get the headings the first time, and then grep your results from the second.

$ ps -a | head -n1 ; ps -a | grep cronolog
  PID TTY          TIME CMD
 8239 pts/14   00:00:00 cronolog
23339 pts/3    00:00:00 cronolog
31527 pts/2    00:00:00 cronolog

alexm · 2022-06-16T16:26:52Z

I'm not sure why ack should need to know anything about the text to always show the first line, i.e. print the first line (whatever it contains) and then proceed to filter from line 2 onwards.

Since ack has so many more features than grep, I thought that this could be nice to have too, but I understand your reluctance to add this feature.

Cheers, and thanks for your excellent work!

petdance · 2022-06-16T16:29:57Z

Hmm, I think I've been mixing up two things. Let's explore this idea.

petdance · 2022-06-16T17:02:16Z

So say we have a call like

ack foo --head=1

That means that ack will show the first line of the input, plus any lines that match /foo/. Questions:

Do we show the first N lines only if there is a match in the file?
Do we show matches in the header lines? If "foo" shows up in the header lines, what do we do?
Do we do the --head=1 rule on every file? The scenario you're describing is good for piped-in data, but ack does more than that.

petdance · 2022-06-16T17:14:48Z

Also, back to my original question: Why use ack at all for this? Why not do:

ps -a | head -n1 ; ps -a | grep cronolog

n1vux · 2022-06-16T17:23:34Z

> Why use ack at all for this?

because i want Perl RE not egrep RE ?

(I use ps -a | perl -nlE 'say if 1..1 or /cronolog/' for such cases, but that's me.)

petdance · 2022-06-16T18:12:29Z

> because i want Perl RE not egrep RE ?

I know why you might, but I was asking @alexm.

alexm · 2022-06-16T19:27:08Z

> Do we show the first N lines only if there is a match in the file?

Yes, only if there's a match in the file.

> Do we show matches in the header lines? If "foo" shows up in the header lines, what do we do?

Nothing, --head=1 would mean that the first N lines are skipped from filtering as would happen with ps -a | head -n1 ; ps -a | ack regex, thus making --head=1 a convenient shortcut of that snippet.

> Do we do the --head=1 rule on every file? The scenario you're describing is good for piped-in data, but ack does more than that.

Yes. For instance (see the notes below):

$ ps -a > ps.txt
$ kubectl get pods > kubectl.txt
$ ack --head=1 foobar *.txt
ps.txt
------
    PID TTY          TIME CMD
   6727 tty2     00:00:51 foobar

kubectl.txt
-----------
NAME                              READY   STATUS      RESTARTS   AGE
foobar-backend-7d8965b74b-wx76t   1/1     Running     0          2d20h

Notes:

the line number is not shown to keep the column layout aligned with the header
there is an additional line below the filename to show more clearly where the output starts, but it's only a suggestion
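
The answers above can be read as the following rough sketch in POSIX shell. It only illustrates the proposed semantics and is not how ack would implement them: `head_grep` is a hypothetical helper, it uses grep -E rather than Perl regexes, and it assumes the first N lines are exempt from the search and are printed only for files that have a match further down. The optional line of dashes under the filename is left out.

```shell
# Hypothetical helper: head_grep N PATTERN FILE...
# For each file with a match after its first N lines, print the file name,
# the first N lines verbatim, and then the matching lines (no line numbers).
head_grep() {
    n=$1
    pattern=$2
    shift 2
    for file in "$@"; do
        # Only lines after the first N are searched.
        if tail -n +"$((n + 1))" "$file" | grep -Eq -- "$pattern"; then
            printf '%s\n' "$file"
            head -n "$n" "$file"
            tail -n +"$((n + 1))" "$file" | grep -E -- "$pattern"
            printf '\n'   # blank line between files
        fi
    done
}
```

With the ps.txt and kubectl.txt files from the example, `head_grep 1 foobar *.txt` would print each file name, its header line, and the matching rows.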

> Also, back to my original question: Why use ack at all for this?

I use ack more often than grep, even though grep now has the -P option that lets you use Perl regexes. I prefer ack for several reasons:

Being able to put a .ackrc file in my projects (and home) to ignore node_modules, vendor, dist, etc.
Using predefined or custom file types in filtering
It feels faster than grep
Does recursive search by default
Ignoring certain files and directories, like .git, etc.
Is shorter to type
And probably many more that I can't remember now 😉

petdance · 2022-06-16T19:37:54Z

I get all those reasons for using ack over grep (I've preached them :-)). I was just asking about the case of filtering output from ps.

alexm · 2022-06-16T19:47:49Z

In ps -a | head -n1 ; ps -a | grep expr the output from both ps commands could theoretically be different.

Then, I imagined myself adding a new option to ack (which would be easier for me than doing it for grep). I'm even willing to send a pull request if I find enough round tuits.

Other than these, I can't say there's any other particular reason to prefer ack over grep to filter ps output.
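
One way to avoid running ps twice, sketched in POSIX shell: read the header from the pipe once, print it, and hand the rest of the same stream to the filter. `with_header` is a hypothetical helper name, not an existing command.

```shell
# Hypothetical helper: ... | with_header COMMAND [ARGS...]
# Prints the first line of stdin unchanged, then runs COMMAND on the rest.
with_header() {
    # On a pipe, read consumes exactly one line and leaves the remainder
    # of the stream for the command that follows.
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    "$@"
}
```

Usage would be `ps -a | with_header grep cronolog`. Since ack also reads piped standard input, `ps -a | with_header ack cronolog` should behave the same way while keeping Perl regexes.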

petdance · 2022-06-16T19:50:02Z

The lines of dashes you've shown would be a new feature, right?

Right now, if you don't want the grouping/line numbers, you have -h. We don't yet have an option to turn off just line numbers, although that feature request has been around a while and I wouldn't be opposed to it. See #142

I wouldn't want --head=1 to change any behaviors on how things get output. If you were doing an ack of multiple files and you wanted --head=1, you would probably also have to have a --no-line-number argument as well.

> > Do we show matches in the header lines? If "foo" shows up in the header lines, what do we do?
>
> Nothing, --head=1 would mean that the first N lines are skipped from filtering

When you say "skipped from filtering" here, do you mean "skipped from being searched"?

If so, then I'm not sure I'm OK with that, but I will think about it. If not, please say more about what you mean by "filtering".

(And thanks for taking the time to work through these questions. This is the tough part of figuring out features.)

petdance · 2022-06-16T19:56:47Z

Some things I'm thinking about: You're talking about using this to show the first line of a stream because you know it's a command like ps and you want the headings. I see the use of this being broader than just that.

For example, I might go acking through a tree of source and do

ack salestax src/ --head=5

because it's helpful to see the first 5 lines of the file that I'm getting results for, even though they aren't a "heading" like in the ps example. Maybe my results look like:

whatever.py
1: # whatever.py
2: # This program does the dingdong doodle.
3: # Created by ....
4: ...
5: ...
78: salestax = calculate_tax()
168: print(salestax)

Having those first 5 lines helps give me context for the actual matches. And that said, I think that if "salestax" appears in the first 5 lines, then it should be highlighted like any other ack match.
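
A minimal sketch of that variant in POSIX shell, assuming that the first N lines are searched like any other line and are shown only for files with at least one match. `head_with_numbers` is a hypothetical name, awk's extended regexes stand in for Perl's, and highlighting is not modelled.

```shell
# Hypothetical helper: head_with_numbers N PATTERN FILE
# Prints the file name, then every line that is among the first N lines
# or matches PATTERN, each prefixed with its line number.
head_with_numbers() {
    n=$1
    pattern=$2
    file=$3
    # Show nothing for files without any match, header lines included.
    grep -Eq -- "$pattern" "$file" || return 1
    printf '%s\n' "$file"
    # The pattern is passed through the environment so that awk does not
    # reinterpret backslashes the way -v assignments do.
    PATTERN=$pattern awk -v n="$n" \
        'NR <= n || $0 ~ ENVIRON["PATTERN"] { print NR ": " $0 }' "$file"
}
```

A header line that also matches is printed once, because both conditions belong to the same awk rule.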

Another thought: How does --head=N interact with --output?

alexm · 2022-06-17T14:26:37Z

> (I use ps -a | perl -nlE 'say if 1..1 or /cronolog/' for such cases, but that's me.)

Wow! Didn't know that trick and I like it a lot 😄

I'm assuming that the if 1..1 uses $. implicitly, but I can't find where this is documented. Any pointers?

Thanks, @n1vux!

n1vux · 2022-06-17T16:54:16Z

@alexm, yes: in scalar context the Range Operator implicitly compares a constant integer operand against $. (the current input line number).
This goes back to the early days when Larry was blending the best of shell, libc, sed, and awk into one language, Perl 1 or 2ish, iirc.
Great for -e one-liners, a little too cryptic for a maintainable script and useless in a reusable module.

On the theory of making simple things simple and hard things easier, --head=9 is a good addition for ack.

(Perl Range Op is more flexible: either value can also be a RE /^start\b/i .. /^end\b/i or logical expression, a() .. b() meaning from first line where a() is true to first line where b() is true. And mix and match.
The Range op in list context is DWIMish magic for list constants.)

https://perldoc.perl.org/perlop#Range-Operators
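
For readers without Perl at hand, awk has a range pattern that is a close relative of the operator described above. The sketch below uses awk, not Perl, so details may differ from Perl's `..`; the function names are hypothetical.

```shell
# Print the first input line plus every later line matching the regex in $1.
# "NR == 1, NR == 1" is an awk range that opens and closes on line 1.
first_or_match() {
    RE=$1 awk 'NR == 1, NR == 1 { print; next } $0 ~ ENVIRON["RE"]'
}

# Print every line from the first one matching $1 through the next one
# matching $2, inclusive (regex endpoints instead of line numbers).
range_lines() {
    START=$1 END=$2 awk '$0 ~ ENVIRON["START"], $0 ~ ENVIRON["END"]'
}
```

For example, `ps -a | first_or_match cronolog` keeps the header row, and `range_lines '^start' '^end' < file` prints one delimited block.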

alexm · 2022-06-20T14:54:12Z

> The lines of dashes you've shown would be a new feature, right?

Right, but it was just an example of what could be done to highlight the filename without breaking the column layout for commands like ps and the like.

> I wouldn't want --head=1 to change any behaviors on how things get output. If you were doing an ack of multiple files and you wanted --head=1, you would probably also have to have a --no-line-number argument as well.

Makes sense. What I'm sensing is that --head=1 has its own place and that some other option --column-names (or whatever) could use --head=1 and -h et al. to achieve what I really was looking for in the beginning.

> When you say "skipped from filtering" here, do you mean "skipped from being searched"?

Yes, I meant that, but after reading the case you made later about showing the first N lines of the source files that match a pattern, I guess it makes more sense to search there too.

alexm · 2022-06-20T15:04:31Z

> because it's helpful to see the first 5 lines of the file that I'm getting results for, even though they aren't a "heading" like in the ps example

Agreed.

> Having those first 5 lines helps give me context for the actual matches. And that said, I think that if "salestax" appears in the first 5 lines, then it should be highlighted like any other ack match.

I changed my mind, you're right.

> Another thought: How does --head=N interact with --output?

Good question. My feeling is that when somebody combines both options, it's because they expect both to take effect. Otherwise, one of them should be removed. Taking the example of the first 5 lines to add context:

show the first 5 lines with text highlighted if there's any match, and then
show the remaining matches as --output dictates.

petdance · 2022-06-20T15:10:22Z

I just realized, maybe --output and --head should be mutually exclusive, and it solves that problem. If you're specifying your own output, then you probably don't want the --head option anyway.

alexm · 2022-06-20T15:43:29Z

> I just realized, maybe --output and --head should be mutually exclusive, and it solves that problem. If you're specifying your own output, then you probably don't want the --head option anyway.

That was my first thought 😄

Is there any other option that is mutually exclusive with --output? i.e. to be coherent regarding its intent.

petdance · 2022-06-20T15:55:53Z

Yes, there are many mutually exclusive options. See the mutex_options function.
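
For illustration only, the kind of check being discussed could look like this in POSIX shell. This is not ack's mutex_options code (ack is written in Perl); `check_mutex` and the error text are made up for the example.

```shell
# Hypothetical check: fail if both --head and --output appear in the arguments.
check_mutex() {
    seen_head=
    seen_output=
    for arg in "$@"; do
        case $arg in
            --head|--head=*)      seen_head=1 ;;
            --output|--output=*)  seen_output=1 ;;
        esac
    done
    if [ -n "$seen_head" ] && [ -n "$seen_output" ]; then
        echo "Options --head and --output cannot be used together" >&2
        return 1
    fi
    return 0
}
```

A wrapper could call `check_mutex "$@" || exit 1` before doing any searching.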

n1vux · 2022-07-18T19:26:06Z

I don't see a statement of default N, maybe i missed it skimming through.
I would suggest that --head without a specific N (as opposed to, e.g., --head=7 for N=7) should mean N=1, as that's the single most common depth of headers.
(and of course --no-head is the default value.)
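
The suggested defaults can be spelled out as a small POSIX shell sketch. `parse_head` is a hypothetical function and only models the three spellings mentioned here; it says nothing about how ack parses its options.

```shell
# Hypothetical parser: prints the effective N for a list of arguments.
#   (no option) or --no-head  -> 0  (feature off, the default)
#   --head                    -> 1  (bare option means one header line)
#   --head=N                  -> N
parse_head() {
    n=0
    for arg in "$@"; do
        case $arg in
            --head)     n=1 ;;
            --head=*)   n=${arg#--head=} ;;
            --no-head)  n=0 ;;
        esac
    done
    printf '%s\n' "$n"
}
```

When the option is repeated, the last occurrence wins in this sketch, which is an assumption rather than something settled in the discussion.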

n1vux · 2022-07-18T19:30:45Z

Can we set flags in .ackrc for only certain file-types?
I could see value in type=csv → head=1 as a personal option.
I might even set it so, were it possible.
(Getting a line of bad data instead of a header would provide a nasty, implicit warning when a CSV does NOT have a header line!)
(It would be wrong as a drop-in replacement for grep, of course. GNU grep 3.7 does not have this feature. Yet.)

alexm closed this as completed Jun 16, 2022

petdance reopened this Jun 16, 2022

petdance changed the title ~~Add an option to show column names (i.e. first line)~~ Add an option to show first N lines (--head=N) Jun 16, 2022
