Lesson 8

How Perl uses Streams and Pipes

In this module you learned how Perl uses streams and pipes to read and write to files and other programs. You saw some specific examples of using streams for both files and pipes to programs. You also got a practical example of both creating a mail form and using a pipe to an existing mail-transfer agent to send the mail.
In the next module you will conclude the course by learning how to debug your CGI programs on a Web server.

A pipe is a unidirectional I/O channel that can transfer a stream of bytes from one process to another. Pipes come in both named and nameless varieties. You may be more familiar with nameless pipes.

Anonymous Pipes

Perl's open function opens a pipe instead of a file when you append or prepend a pipe symbol to the second argument to open. This turns the rest of the arguments into a command, which will be interpreted as a process (or set of processes) that you want to pipe a stream of data either into or out of. Here is how to start up a child process that you intend to write to:

open SPOOLER, "| cat -v | lpr -h 2>/dev/null"
    or die "can't fork: $!";
local $SIG{PIPE} = sub { die "spooler pipe broke" };
print SPOOLER "stuff\n";
close SPOOLER or die "bad spool: $! $?";

This example actually starts up two processes, the first of which (running cat) we print to directly. The second process (running lpr) then receives the output of the first process. In shell programming, this is often called a pipeline. A pipeline can have as many processes in a row as you like, as long as the ones in the middle know how to behave like filters; that is, they read standard input and write standard output. Perl uses your default system shell (/bin/sh on Unix) whenever a pipe command contains special characters that the shell cares about. If you are only starting one command, and you do not need or want to use the shell, you can use the multiargument form of a piped open instead:

open SPOOLER, "|-", "lpr", "-h" # requires 5.6.1
    or die "can't run lpr: $!";
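To make the shell-pipeline form of a piped open concrete, here is a minimal sketch that writes to a two-stage pipeline. It assumes a Unix-like system with sort and uniq on the PATH. The helper name write_sorted_unique and the file name pipe_demo.out are hypothetical, and the last stage redirects to a file only so the result is easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical helper: feed lines to `sort | uniq` and save the result in $outfile.
sub write_sorted_unique {
    my ($outfile, @lines) = @_;

    # The command contains shell metacharacters (| and >), so Perl hands it to /bin/sh.
    # $outfile is interpolated into a shell command, so this sketch assumes it is
    # a trusted, simple file name.
    open(my $pipe, "| sort | uniq > $outfile")
        or die "can't fork: $!";

    print {$pipe} "$_\n" for @lines;

    # Closing a pipe handle waits for the child and sets $? to its exit status.
    close($pipe)
        or die "pipeline failed: $! $?";
    return;
}

write_sorted_unique("pipe_demo.out", qw(pear apple pear fig));
```

Note the use of the low-precedence or rather than || after open and close: with ||, the die would attach to the last argument instead of to the call's return value.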

If you reopen your program's standard output as a pipe to another program, anything you subsequently print to STDOUT will be standard input for the new program. So to page your program's output, you would use:

if (-t STDOUT) { # only if stdout is a terminal
    my $pager = $ENV{PAGER} || "more";
    open(STDOUT, "| $pager") || die "can't fork a pager: $!";
}
END {
    close(STDOUT) || die "can't close STDOUT: $!"
}

When you are writing to a filehandle connected to a pipe, always explicitly close that handle when you are done with it. That way your main program does not exit before its offspring.

Here is how to start up a child process that you intend to read from:

open STATUS, "netstat -an 2>/dev/null |"
    or die "can't fork: $!";
while (<STATUS>) {
    next if /^(tcp|udp)/;
    print;
}
close STATUS or die "bad netstat: $! $?";
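The same read loop can be sketched with a lexical filehandle and an explicit look at the child's exit status. This is only an illustration under stated assumptions: the helper name filtered_output is hypothetical, and a printf command stands in for netstat so that the output is predictable on any Unix-like system.

```perl
use strict;
use warnings;

# Hypothetical helper: run a shell command and return the output lines
# that do not match $skip, mirroring the netstat filter loop.
sub filtered_output {
    my ($command, $skip) = @_;

    # A trailing pipe symbol tells open to start $command and read its stdout.
    open(my $child, "$command |")
        or die "can't fork: $!";

    my @kept;
    while (my $line = <$child>) {
        next if $line =~ $skip;
        push @kept, $line;
    }

    # On a pipe, close returns false when the child exits with a nonzero status;
    # the exit code is stored in the high byte of $?.
    close($child)
        or die "command failed: exit status " . ($? >> 8) . "\n";
    return @kept;
}

# printf (run by the shell) stands in for netstat here.
my @lines = filtered_output(q{printf 'tcp 1\nunix 2\nudp 3\n'}, qr/^(tcp|udp)/);
print @lines;
```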

You can open a multistage pipeline for input just as you can for output. You can avoid the shell by using an alternate form of open:

open STATUS, "-|", "netstat", "-an" # requires 5.6.1
    or die "cannot run netstat: $!";

But then you do not get I/O redirection, wildcard expansion, or multistage pipes, since Perl relies on your shell to do those.
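A short sketch can show what bypassing the shell means in practice. It assumes an external echo program is available on the PATH, and the helper name run_without_shell is hypothetical. Because no shell is involved, the wildcard and the redirection symbol reach echo as ordinary arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: run a program with an argument list, bypassing the shell,
# and return everything it wrote to standard output.
sub run_without_shell {
    my ($program, @args) = @_;

    # List form of a piped open: $program is executed directly with @args.
    open(my $child, "-|", $program, @args)
        or die "cannot run $program: $!";

    my $output = do { local $/; <$child> };   # slurp all of the child's output

    close($child)
        or die "$program failed: $! $?";
    return $output;
}

# The "*.txt" is not expanded and the ">" does not redirect anything.
print run_without_shell("echo", "*.txt", ">", "/dev/null");
```

The same property makes the list form a reasonable choice when arguments come from untrusted input, since nothing in them can be interpreted as shell syntax.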
You might have noticed that you can use backticks to accomplish the same effect as opening a pipe for reading:

print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
die "bad netstat" if $?;
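As a rough illustration of the backtick form, the sketch below captures a command's output as a list of lines and decodes the exit status from $?. The helper name capture_lines is hypothetical, and the echo commands stand in for a real program such as netstat.

```perl
use strict;
use warnings;

# Hypothetical helper: capture a command's output with backticks and
# return its exit code followed by the output lines.
sub capture_lines {
    my ($command) = @_;

    # In list context, backticks return one element per line of output.
    # Unlike a piped open, all output is collected before the call returns.
    my @lines = `$command`;

    # $? is -1 if the command could not be started at all.
    die "can't run command: $!" if $? == -1;

    return ($? >> 8, @lines);   # exit code is in the high byte of $?
}

my ($status, @out) = capture_lines("echo one; echo two; exit 3");
print "exit code $status\n";
print @out;
```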