Drop #374 (2023-11-20): Redundancies Redux
good-cat; which which is which; less is more
Redundancies are fine when it comes to file storage, networking, power, etc. But do we really need them at the CLI?
Today, we root out some of these redundancies and see that there may, indeed, be room for them all in our filesystems.
Only one external resource today, so no TL;DR section.
good-cat
The cat command is a standard Unix utility that reads files sequentially and writes them to standard output. Its primary purpose is to concatenate files, but it's often used to simply print the contents of a single file. However, there are times when using cat is redundant and can be replaced with more efficient alternatives.
A common example of redundant cat usage is when it's used to pipe the contents of a single file into another command, like grep or sed. Instead of using cat file | grep pattern, you can pass the file directly as an argument to the command: grep pattern file. This eliminates an extra process and can improve performance.
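To make the comparison concrete, here is a minimal sketch that runs both forms against a throwaway file and captures the results. The file contents and the variable names (tmpfile, with_cat, without_cat) are made up for illustration.

```shell
# Build a small throwaway file so the sketch is self-contained.
tmpfile=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$tmpfile"

# Redundant: cat is an extra process whose only job is to feed grep.
with_cat=$(cat "$tmpfile" | grep 'mm')

# Direct: grep opens the file itself.
without_cat=$(grep 'mm' "$tmpfile")

# Both forms print the same matching line.
printf 'with cat:    %s\n' "$with_cat"
printf 'without cat: %s\n' "$without_cat"

rm -f "$tmpfile"
```

Both variables end up holding the same text; the only difference is how many processes were needed to get there.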
Another instance of unnecessary cat usage is when it's used in a loop to read a list of items from a file. For example, instead of using:
for i in $(cat file); do … done
you can use a more efficient loop construct like:
while IFS= read -r i; do ... done < file
This avoids the overhead of creating a separate process for cat. It also reads the file one line at a time, so lines containing spaces are not split into separate words, and the -r flag keeps backslashes intact.
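The difference is easiest to see by counting iterations. The sketch below assumes a two-line file where each line contains a space; the file and the counter names are invented for the demo.

```shell
# Two lines, each containing a space.
tmpfile=$(mktemp)
printf 'first item\nsecond item\n' > "$tmpfile"

# Word-splitting loop: every whitespace-separated word is one iteration.
split_count=0
for i in $(cat "$tmpfile"); do
  split_count=$((split_count + 1))
done

# Line-oriented loop: IFS= preserves surrounding whitespace, -r preserves backslashes.
line_count=0
while IFS= read -r line; do
  line_count=$((line_count + 1))
done < "$tmpfile"

echo "for loop iterations:   $split_count"
echo "while loop iterations: $line_count"

rm -f "$tmpfile"
```

The for loop runs four times (once per word), while the while loop runs twice (once per line), which is usually what was intended.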
Some folks argue that using cat in a pipeline makes the command more readable and easier to understand, as it follows a left-to-right flow of data. However, this comes at the cost of efficiency (re: the noted process spawning). In most cases, the performance difference might not be significant, but it can add up when running scripts frequently or processing large amounts of data.
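One possible middle ground, offered here as a sketch rather than anything the tools above prescribe, is to put the input redirection first. POSIX shells allow a redirection to appear before the command name, so the data still reads left to right without spawning cat. The file contents and variable names are made up.

```shell
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# The redirection comes first, so the pipeline still reads left to right:
# file -> grep -> tr, with no cat process involved.
result=$(< "$tmpfile" grep 't' | tr '[:lower:]' '[:upper:]')

printf '%s\n' "$result"

rm -f "$tmpfile"
```

Whether this is actually more readable than cat file | grep is a matter of taste; it only removes the extra process.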
Andrew Lilley Brinker is firmly in the “never use cat redundantly” camp and made good-cat to help break you out of the habit.
This is not a hill I am willing to die on, but I do think I'm on the “readability” side of this argument.
which which is which
The cat command is not the only place we have redundancies at the CLI. We have four, yes four, ways to find out where a given utility is in our systems: which, whereis, type, and command -v.
Here they are in action:
$ which R && whereis R && type R && command -v R
/usr/bin/R
R: /usr/bin/R /usr/lib/R /etc/R /usr/local/lib/R /usr/share/R /usr/share/man/man1/R.1.gz
R is /usr/bin/R
/usr/bin/R
There has to be a reason we have four of them, right?
The which command is used to locate the executable file associated with the given command. It searches for the executable in the directories listed in the PATH environment variable. If the command is found, which returns the full path of the executable; otherwise, it returns nothing.
The whereis command is used to locate the binary, source, and manual page files for a command. It searches in a predefined set of directories, ensuring a balance between speed and comprehensiveness. Unlike which, whereis is not limited to your search path and does not require that the files be executable.
The type command is used to describe how its argument would be interpreted if used as a command. It can differentiate between shell built-in commands, aliases, and external commands. If the command is an alias, type will show the alias definition. If it's a shell reserved keyword, type will indicate this. If it's a shell function, it will display the function's definition. In the case of the argument being a builtin, type will indicate this. And, if the command is an external file (like a script or a binary), it will show the file's path.
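A quick way to see these cases side by side is to ask type about one name of each kind. This is a minimal sketch: greet is a hypothetical function defined only for the demo, and the exact wording of each description varies a little between shells.

```shell
# greet is a hypothetical function that exists only for this demo.
greet() { echo "hello"; }

builtin_desc=$(type cd)      # cd is a shell builtin
function_desc=$(type greet)  # greet is a function (bash also prints its body)
keyword_desc=$(type if)      # if is a reserved word

printf '%s\n' "$builtin_desc" "$function_desc" "$keyword_desc"

# External commands resolve to a file; the exact path differs between systems.
type ls
```

Aliases are left out on purpose, since many shells do not expand them in non-interactive scripts.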
The command builtin runs a command with its arguments while bypassing shell function lookup, and returns that command's exit status. Its main use is to let you define a shell function with the same name as a builtin or utility while still calling the original from inside the function. The -v option makes command print a description of the command similar to the type builtin.
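Both uses can be sketched in a few lines. The helper name require and the pwd wrapper below are illustrative inventions, not part of any tool mentioned here.

```shell
# require is a hypothetical helper: succeed only if the named tool can be found.
require() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found $1 at $(command -v "$1")"
  else
    echo "missing $1" >&2
    return 1
  fi
}

require sh
require definitely-not-installed-xyz || echo "carrying on without it"

# A function that shadows pwd; `command pwd` skips the function and
# reaches the original, so there is no infinite recursion.
pwd() {
  printf 'cwd: %s\n' "$(command pwd)"
}

pwd
```

Run unset -f pwd afterwards if you try this in an interactive shell and want the plain pwd back.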
less is more
File pagers are handy beasts and virtually everyone knows about both less and more. But you may discover some things you didn't know about this redundant pair of pagers.
The more utility has been around since 1978 and is, for better or worse, my muscle-memory pager. It's ~2K lines of bare-bones C code, so the features are limited, but not too limited. While it's running, various commands are available at the prompt:
?: Show the commands that are available at the more prompt.
ENTER: Display the next line.
SPACEBAR: Display the next screen.
q: Quit the more command.
f: Display the next file listed on the command line.
=: Show the line number.
p <n>: Display the next n lines.
s <n>: Skip the next n lines.
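At invocation, various parameters can be used. The slash-prefixed forms below are the ones the Windows more command accepts; Unix versions of more take dash-prefixed options, which vary by implementation: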
/c: Clears the screen before displaying a page.
/p: Expands form-feed characters.
/s: Displays multiple blank lines as a single blank line.
/t <n>: Displays tabs as the number of spaces specified by n.
+/pattern: Searches the string inside your text document. You can view all the instances by navigating through the result.
The less utility has far too many options to detail here. But we can hit some interesting ones:
The F command (or +F at startup) causes less to monitor the file contents for changes, similar to tail -f.
less can execute a specified command each time a new file is examined via +<cmd>. A special case of this is +G, which causes less to initially display each file starting at the end rather than the beginning.
It's possible to define your own less commands by creating a lesskey source file. This file specifies a set of command keys and an action associated with each key. It pairs well with the
To view a full list of all commands, type h while less is running.
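As a rough sketch of the lesskey idea, the snippet below writes a tiny source file that binds two keys to paging actions. The key choices (z and x) and the throwaway file location are arbitrary assumptions. How less picks the file up depends on your version, so the invocations are left as comments.

```shell
# Write a tiny lesskey source file to a throwaway location (hypothetical path).
lesskey_src=$(mktemp)
cat > "$lesskey_src" <<'EOF'
#command
z back-screen
x forw-screen
EOF

echo "wrote lesskey source to $lesskey_src"

# Assumption: newer releases of less can read a source file directly via
# --lesskey-src; older ones need it compiled with the lesskey program first.
#   less --lesskey-src="$lesskey_src" some-file.txt
#
# The startup commands described above can be tried the same way:
#   less +G some-file.txt   # open at the end of the file
#   less +F some-file.txt   # keep following the file, like tail -f
```

The action names back-screen and forw-screen come from the lesskey documentation; check man lesskey on your system for the full list before settling on bindings.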
While redundancies can cause confusion, it is pretty cool that we have choices when working at the CLI. Hopefully, there was at least one new tidbit for y'all!
Also: many thanks to the new/recent Bonus Drop subscribers 🙏🏽! ☮️