Perl


Perl, the Practical Extraction and Report Language, is a high-level programming language. It is often called a scripting language and has also been referred to as a "glue language" and as the "duct tape of the internet". See the perl man page and perldoc perlintro to get started.

Detractors (or Larry Wall, Perl's creator, himself) may refer to Perl as the Pathologically Eclectic Rubbish Lister.

Sample Perl Program

#!/usr/bin/perl
# This is the prototypical "Hello World!" program, this time
# written in Perl.
use strict;
use warnings;
print "Hello World!\n";

The first line is the shebang line; it identifies the location of the perl interpreter. If you don't know where it's located, type which perl to find out. If perl is not in your path or not installed on your system, which will print nothing. Most distributions come with Perl already installed.

Next come a couple of comment lines. Note that the very first line of the script (the shebang line) is not just a comment, even though it looks like one.

The use strict line helps catch code that could later make your life difficult, such as misspelled or undeclared variables. Always use it. See man perltrap (or perldoc perltrap) for other ways to avoid common pitfalls. use warnings also helps catch possible problems; it is the more modern alternative to passing perl the -w option.
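As a rough illustration of what use strict catches, the sketch below compiles two one-line snippets with a string eval. The helper name compiles_ok and the variable names are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the snippet compiles and runs under
# "use strict", otherwise 0 (the error message is left in $@).
sub compiles_ok {
    my ($code) = @_;
    my $ok = eval "use strict; $code; 1";
    return $ok ? 1 : 0;
}

# A variable declared with "my" is fine.
print compiles_ok('my $count = 1'), "\n";

# A misspelled, undeclared variable is a compile-time error under strict.
print compiles_ok('$cuont = 1'), "\n";
print "strict said: $@" if $@;
```

Without use strict, the second snippet would quietly create a new global variable, which is the kind of typo that is hard to track down later.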

The print line displays the text; \n is an escape sequence that stands for a newline.

Perl Modules

Perl modules provide additional functionality. You can list the Perl modules installed on your system.
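One possible way to list modules is to look for .pm files under the directories in @INC. The sketch below does that with the core File::Find and File::Spec modules; the helper name installed_modules is made up for this example. It is only an approximation: when one @INC directory sits inside another, some names will appear with a spurious directory prefix.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Hypothetical helper: returns a sorted list of module names derived from
# the .pm files found under the directories in @INC.
sub installed_modules {
    my %seen;
    for my $dir (grep { !ref($_) && -d $_ } @INC) {
        find({
            no_chdir => 1,    # keep $_ as the full path of each file
            wanted   => sub {
                return unless /\.pm\z/ && -f $_;
                my $rel = File::Spec->abs2rel($_, $dir);
                $rel =~ s/\.pm\z//;
                # Foo/Bar.pm becomes Foo::Bar
                $seen{ join '::', File::Spec->splitdir($rel) } = 1;
            },
        }, $dir);
    }
    return sort keys %seen;
}

my @modules = installed_modules();
printf "%d module files found, for example:\n", scalar @modules;
print "  $_\n" for grep { defined } @modules[0 .. 4];
```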

CPAN

One of the greatest resources for any Perl programmer is CPAN, the Comprehensive Perl Archive Network, a collection of Perl modules that you can use in your own programs to simplify many tasks.

Perl and CGI scripting

Perl is commonly used for writing CGI scripts (scripts that run on a web server). If the web server is configured to treat a file as a CGI script, the file is executed and its output is sent to the browser, instead of its contents being displayed. Data from forms can be passed to CGI scripts.

CGI scripts are usually placed in a directory called cgi-bin. Check your distribution's (or your ISP's) documentation to discover where this directory is -- it shouldn't be in the same directory as your HTML files. Make sure the script has its permissions set to 755 (see chmod for details on setting file permissions).

Simple CGI script

#!/usr/bin/perl
print "Content-type: text/html\n\n";
print "<HTML>\n";
print "<BODY>\n";
print "Hello, world!\n";
print "</BODY>\n";
print "</HTML>\n";
exit;

Note that the first line we print declares a MIME type for our short HTML output. If httpd were sending a regular file, it would work out the file type itself (typically from the file's extension or its contents) and send a MIME type automatically; a CGI script is expected to send its own. (That sounds like a minus, but it also means that a script can easily send additional HTTP headers if necessary.) The second \n produces a blank line, which terminates the headers. The remainder is just a short "hello world" HTML file.
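To make the point about additional headers concrete, here is a minimal sketch of a helper that builds the header block. The name cgi_header and the Cache-Control header are chosen purely for illustration; neither comes from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the header block a CGI script prints before
# its body. Extra headers are passed as name/value pairs.
sub cgi_header {
    my ($type, @extra) = @_;
    my $out = "Content-type: $type\n";
    while (my ($name, $value) = splice(@extra, 0, 2)) {
        $out .= "$name: $value\n";
    }
    return $out . "\n";    # the blank line ends the headers
}

print cgi_header('text/html', 'Cache-Control' => 'no-cache');
print "<HTML><BODY>Hello, world!</BODY></HTML>\n";
```

Whatever headers are added, the blank line must come last; anything printed after it is treated as the body.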

A form reader

The following program reads and displays data from HTML forms. It's useful for debugging, or as the basis for a more complicated script.

#!/usr/bin/perl -w
use strict;

my $input_buffer;
my (@post_names,@get_names);
my %parameters;
my ($name, $value);
my $referrer = $ENV{'HTTP_REFERER'} // '';    # may be unset

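foreach (split(/&/, $ENV{'QUERY_STRING'} // '')) {
    next unless length;                    # skip empty pairs such as "a=1&&b=2"
    tr/+/ /;                               # a '+' stands for a space
    ($name, $value) = split /=/, $_, 2;    # split before decoding
    $value = '' unless defined $value;
    # Decode %XX escapes; 'C' is an unsigned byte ('c' is signed).
    $name  =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;
    $value =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;
    $parameters{$name} = $value;
    push @get_names, $name;
}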

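# read returns the number of bytes read, not the data, so fill the
# buffer first and split it afterwards.
read(STDIN, $input_buffer, $ENV{'CONTENT_LENGTH'} || 0);
foreach (split(/&/, $input_buffer // '')) {
    next unless length;                    # skip empty pairs
    tr/+/ /;                               # a '+' stands for a space
    ($name, $value) = split /=/, $_, 2;    # split before decoding
    $value = '' unless defined $value;
    $name  =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;
    $value =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;
    push @post_names, $name;
    $parameters{$name} = $value;
}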

print << "--STOP--";
Content-type: text/html

<HTML>
<HEAD><TITLE>General Purpose Form Reader</TITLE></HEAD>
<BODY>
<H1>General Purpose Form Reader</H1>
--STOP--

print "Referring page was \"$referrer\"<BR><BR>\n";

if (@get_names) {
    print "<BR>Items submitted by GET / query string  (in order):\n";
    foreach (@get_names) {
        print "<BR>\"$_\" = \"$parameters{$_}\"\n";
    };
};

if (@post_names) {
    print "<BR>Items submitted by POST method  (in order):\n";
    foreach (@post_names) {
        print "<BR>\"$_\" = \"$parameters{$_}\"\n";
    };
};

if (!(@get_names) && !(@post_names)) {
    print "<BR>Nothing was submitted!\n";
};

print "</BODY>\n</HTML>\n";

The first section simply sets up some variables. The hash called %ENV is already set up for you by the web server. The next section reads the query string, which is stored in the environment variable QUERY_STRING; splits it into name/value pairs; and decodes the special characters (a + for a space, %XX for any other encoded byte) in the names and values. The decoding needs to be done after the split, in case one of the special characters decodes to an '=' (equals) sign. Yes, I made that mistake, and now I'm telling you about it so you don't have to. The names are stored in an array, which is only really necessary for keeping the GET and POST variables separate; feel free to drop it in a "real life" script if you don't care how the variables got there. The values go into an associative array (hash), indexed by name.
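The split-then-decode order can be tried outside a web server with a small sketch like the following. The helper names url_decode and parse_query and the sample query string are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: decodes one URL-encoded name or value.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                                # '+' stands for a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;    # %XX is one byte
    return $text;
}

# Hypothetical helper: splits into pairs first, decodes second.
# Returns the names in order plus a hash of name => value.
sub parse_query {
    my ($query) = @_;
    my (@names, %values);
    for my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        $name  = url_decode($name);
        $value = url_decode($value);
        push @names, $name;
        $values{$name} = $value;
    }
    return (\@names, \%values);
}

# %3D is an encoded '=' inside the value of "title".
my ($names, $values) = parse_query('title=a%3Db&note=hello+world');
print "$_ = $values->{$_}\n" for @$names;
```

Had the string been decoded before splitting, the %3D would already be a plain '=' and the value of title would be cut at the wrong place.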

The next section reads the POST data, if any, and stores it similarly: names into an array, values into the same associative array. POST data is passed to the script through STDIN, and an environment variable (CONTENT_LENGTH) holds the length of the data in bytes. Again, the data consists of a set of name=value pairs, separated by & signs.
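Reading the POST body can be sketched on its own as follows. The helper name read_post_body is made up, and an in-memory filehandle stands in for STDIN so that the example runs outside a web server.

```perl
use strict;
use warnings;

# Hypothetical helper: reads up to $length bytes of POST data from a
# filehandle. read() returns a byte count; the data lands in the buffer.
sub read_post_body {
    my ($fh, $length) = @_;
    my $buffer = '';
    return $buffer unless $length && $length =~ /\A\d+\z/;
    my $got = read($fh, $buffer, $length);
    die "read failed: $!" unless defined $got;
    return $buffer;
}

# Stand-in for STDIN (sample data invented for this example).
my $fake_body = 'name=Larry&lang=Perl';
open(my $fh, '<', \$fake_body) or die "cannot open in-memory handle: $!";
my $body = read_post_body($fh, length $fake_body);
print "$body\n";
```

In a CGI script the call would presumably look like read_post_body(\*STDIN, $ENV{'CONTENT_LENGTH'}); the returned string is then split on & in the same way as the query string.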

Finally, a web page is started, and the GET and POST values are printed in turn.
