Topic: UNIX Plumbing: Inputs, Pipes, and Outputs
Much of the power and flexibility of UNIX systems stems from their ability to easily combine commands to perform complex tasks without the need to design new programs. This is accomplished by taking output from one command and using it as input to another. One program might produce a list of users on a system, while another sorts it, and a third prints or mails it to someone.
Behind the Scenes
What makes connections between otherwise unrelated programs possible is something very basic that all UNIX programs have in common. Each has a "standard input" and a "standard output," called stdin and stdout respectively. Output from one program can easily be "piped" to the input of another since the inputs and outputs always work the same way. (In addition to stdin and stdout, each program also has a "standard error" output, referred to as stderr.)
It is important to distinguish between input and arguments, as they are vastly different from a program's perspective. Input is the data a program receives to process or manipulate (on stdin, where this discussion is concerned), whereas arguments are instructions given to a program as part of the command line that control how it operates. When we look at sophisticated examples of combining commands, this distinction can become blurry if the difference between input and arguments isn't clear.
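A small sketch may help keep the two apart. The fruit names and the variable names result and count are made up for illustration; sort, wc, printf and echo are the standard utilities.

```shell
# The three lines produced by printf are INPUT: data that sort reads on its stdin.
# The -r is an ARGUMENT: an instruction on the command line telling sort
# to reverse its usual order.
result=$(printf 'pear\napple\nfig\n' | sort -r)
echo "$result"

# The same word can play both roles. "hello" is an argument to echo,
# which writes it to stdout; wc then receives it as input and counts
# its characters (five letters plus the newline that echo adds).
count=$(echo hello | wc -c)
echo "$count"
```

Neither sort nor wc was given a file name here, so each falls back to reading its stdin.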
Applications
In practice there are some very common applications for these "pipes," the most common being the use of a "pager" to display output from a program a screenful at a time. There are a number of pagers available on most UNIX systems, the most common of which are pg (which was probably the first, and isn't even included with Linux now), more, and the comically named less. They all display text sent to their stdin, pausing after each screenful until a key is pressed. Each offers a different set of features, but all are used the same way:
ls -al|more
The connecting symbol is referred to as the "pipe" symbol and connects the output from the first program (ls in this example) to the input of the second (my pager of choice, more).
You can connect more than two programs using pipes. Starting with our example above, maybe you want the listing in reverse order before you view it. Many versions of ls can do this on their own with the -r option, but sort will do it for the output of any command:
ls -al|sort -r|more
The sort command sorts everything it receives on its stdin, sending it out to its stdout when done, in this case to our pager more.
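One thing to keep in mind is that sort compares whole lines unless told otherwise, so in a long listing the permission column at the start of each line decides the order. The sketch below uses a made-up listing (the function listing and its three files are hypothetical stand-ins for real ls -l output) and assumes the file name is the ninth field, as it is in typical long listings.

```shell
# Stand-in for "ls -l" output, so the result is predictable.
listing() {
    printf '%s\n' \
        '-rw-r--r-- 1 ann staff  120 Jan  5 10:00 notes.txt' \
        'drwxr-xr-x 2 ann staff 4096 Jan  6 11:30 archive' \
        '-rwxr-xr-x 1 ann staff  890 Jan  4 09:15 build.sh'
}

# Plain sort -r compares each line from its first character,
# so the permission column decides the order.
by_line=$(listing | sort -r)

# -k9 tells sort to start comparing at the ninth blank-separated
# field, which is the file name in this sample.
by_name=$(listing | sort -r -k9)

echo "$by_name"
```

With this sample data the by-name result lists notes.txt first, then build.sh, then archive.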
There is also a pipe fitting called tee available. Just as a T-fitting in a pipe takes one input and sends it in two directions, so does the UNIX tee. If you want to view the output from a program as well as save it in a file, you can say:
ls -al|sort -r|tee saved.list|more
This works just like previous examples except while you're viewing the listing, a copy is being saved in the file saved.list.
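To see both branches of the tee at once, here is a minimal sketch. It swaps the pager for tr so that the two copies are easy to tell apart, works in a scratch directory, and assumes mktemp is available (it is on most current systems). The names dir, shown and saved are made up for the example.

```shell
# Work in a throwaway directory so no existing file is touched.
dir=$(mktemp -d)

# tee writes everything it reads to saved.list AND passes it along
# unchanged to the next command, which here upper-cases it.
shown=$(printf 'cherry\napple\nbanana\n' | sort -r | tee "$dir/saved.list" | tr '[:lower:]' '[:upper:]')
saved=$(cat "$dir/saved.list")

echo "$shown"    # what came out of the end of the pipeline
echo "$saved"    # the copy tee kept, taken before tr changed anything
```

The saved copy is in lower case because tee sits before tr in the pipeline; it records the data as it looked at that point.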
Redirection
Many times we simply want the output from a command to be saved in a file. This is just as easy as using pipes; instead of the pipe symbol, the greater-than symbol is used:
ls -al>saved.list
To the right of the > symbol you place the name of the file that should be created with the program's output. One major variation of this exists:
ls -al>>saved.list
The double >> means "append output to this file." If a file with that name already exists the output from the command will be added at the end of the file. The first example with a single > would have erased the original contents of the file.
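The difference is easy to check for yourself. This sketch uses echo in place of ls so the file contents are predictable, and a scratch directory from mktemp (assumed to be available) so nothing of yours is overwritten. The variable names are made up.

```shell
dir=$(mktemp -d)

echo first  > "$dir/saved.list"     # creates the file
echo second > "$dir/saved.list"     # empties the file, then writes
after_overwrite=$(cat "$dir/saved.list")

echo third >> "$dir/saved.list"     # adds to the end of the file
after_append=$(cat "$dir/saved.list")

echo "$after_overwrite"
echo "$after_append"
```

After the second command "first" is gone; after the third, the file holds "second" followed by "third".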
Remember that I mentioned each command also has a "stderr" output? It is separate from stdout and won't be redirected unless you specifically address it:
ls -al>saved.list 2>saved.error
Each output is numbered as well as named and we can use that number to specify an output explicitly. By default, a redirected output applies to stdout. Stderr is specified by preceding the redirect symbol with its number, which is always 2. In case you're wondering, stdin's number is 0 (zero), and stdout is 1 (one). These numbers are called "file descriptors" in UNIX-speak.
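Here is a minimal sketch of the two outputs going their separate ways. The function report is a hypothetical stand-in for a command that writes to both streams, and mktemp is assumed to be available for the scratch directory.

```shell
dir=$(mktemp -d)

# Hypothetical helper that writes one line to each output.
report() {
    echo "normal output"         # goes to stdout (descriptor 1)
    echo "error output" >&2      # goes to stderr (descriptor 2)
}

# A bare > means 1>, so this sends stdout to one file and stderr to another.
report >"$dir/saved.list" 2>"$dir/saved.error"

out=$(cat "$dir/saved.list")
err=$(cat "$dir/saved.error")
echo "$out"
echo "$err"
```

Writing 1>saved.list instead of >saved.list would do exactly the same thing; the 1 is simply the default.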
A commonly found redirect looks like this:
ls -al >saved.list 2>&1
This sends output on stderr (output 2) to the same place as output 1 (stdout). Note that this is the proper way to specify this behavior, as the possibly more intuitive:
ls -al >saved.list 2>saved.list
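may not behave as expected: the shell opens saved.list twice, and because each output keeps its own position in the file, one stream can overwrite what the other has written.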
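The order of the redirections matters too, because the shell processes them from left to right. In this sketch, report is again a hypothetical command that writes one line to each output, and the file names are made up; mktemp is assumed to be available.

```shell
dir=$(mktemp -d)

# Hypothetical helper that writes one line to each output.
report() {
    echo "normal output"
    echo "error output" >&2
}

# stdout is sent to the file first, then stderr is pointed at the same
# place, so both lines land in the file.
report >"$dir/both.list" 2>&1
both=$(cat "$dir/both.list")

# Reversed: stderr is pointed at wherever stdout goes at that moment
# (here, the command substitution), and only afterwards is stdout
# sent to the file. The error line never reaches the file.
leaked=$(report 2>&1 >"$dir/only_stdout.list")
only=$(cat "$dir/only_stdout.list")

echo "$both"
echo "$leaked"
echo "$only"
```

In other words, 2>&1 means "send stderr where stdout is going right now," not "tie the two together from here on."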
For More Information...
Arranging for command pipes and redirection is the responsibility of the UNIX shell you are running. The examples here have been in Bourne Shell syntax. The manual page for sh is long, but contains exhaustive details on various aspects of redirection (the index has a section heading for redirection and an entry under shell grammar for pipelines).