A Perl 5 Overview and Quick-reference
2013-01
Intro
This is a rough Perl 5 quick-reference that contains some brief overview material, recommendations, and reminders regarding how to use Perl 5. It assumes you’ve used Perl 5 before, but maybe have gotten a little rusty. :)
A good place to look for more info on using Perl 5 is the Modern Perl book.
Other good documentation, articles, and books:
- the Official Perl documentation
- http://perl-tutorial.org/
- Damian Conway’s Perl Best Practices (aka “PBP”) book (though, use Moose for OOP)
- Randal Schwartz’s columns
- Perlmonk’s Tutorials
- Stackoverflow Perl-tagged questions
- http://perl.plover.com/
Installation
You may wish to install your own Perl 5 (ex. ~/opt/perl/bin/perl) instead of using the one that comes with your system (/usr/bin/perl). The system Perl is often used heavily by various system processes, and it’s probably best not to tinker with it too much.
Tip: Use only your OS package installer (ex. apt-get or yum) to install Perl modules into the system Perl. Use only cpanm (or similar) for installing modules into your own Perl.
Of particular importance: don’t use cpanm with your system Perl. For example, when using Debian, apt knows about what it installs, but wouldn’t know about packages installed with cpanm.
Installing your very own Perl 5, OTOH, lets you experiment, tweak, and even break things — you can always just wipe it out and start fresh again, if that becomes necessary.
To install your very own Perl 5, grab the Perl 5 source code and manually install into ~/opt like so:
cd ~/opt
mkdir perl-5.n.m
mkdir perl-5.n.m/bin
cd src # Make this dir if you need to.
cp path/to/perl-5.n.m.tar.gz .
tar xzf perl-5.n.m.tar.gz
cd perl-5.n.m
# rm -f config.sh Policy.sh ## Not necessary for a newly-unpacked kit.
./Configure -Dprefix=/home/<you>/opt/perl-5.n.m
# Answer a lot of questions, accepting the defaults.
# If it complains that it can't find "/usr/bin/less -R",
# tell it to use "/usr/bin/less" instead.
make
make test
make install
# Final step (to be put on your $PATH).
cd ~/opt
ln -s perl-5.n.m perl
(where n and m are the version and sub-version numbers, and <you> is your home directory name).
Now add the following to your ~/.profile:
export PATH=$HOME/opt/perl/bin:$PATH
and log out/in again for the change to take effect. Check that you’re getting the perl you think you’re getting:
which perl # should be ~/opt/perl/bin/perl
perl -v # should be your newly-installed v5.n.m
# Alternatively:
perl -e 'print "$]\n"' # or
perl -e 'print "$^V\n"'
As another option, you can instead use perlbrew to automatically install one or more Perls into your home directory.
Before doing anything else (but after you’ve logged out and back in again so the new perl is first on your PATH), install the cpanminus tool. I suggest using the clever “bootstrap method”, whereby cpanm installs itself:
curl -L http://cpanmin.us | perl - App::cpanminus
You should now have ~/opt/perl/bin/cpanm present.
As a quick test, have cpanminus install Modern::Perl:
cpanm Modern::Perl
This will install Modern::Perl into your own Perl’s site_perl dir.
Perl doesn’t come with a REPL out of the box. Install this one:
cpanm Devel::REPL
After that completes, you may also need to install a handful of supporting modules for the repl to work smoothly, such as File::Next, B::Keywords, and Term::ReadLine::Gnu. After that’s done, try it out:
$ re.pl
Here are some commonly-used modules you also might install right off the bat:
DateTime Moose Config::Tiny Template Try::Tiny
IPC::System::Simple Capture::Tiny File::Slurp
and for database use, probably also:
DBI DBD::SQLite
For creating and distributing your own Perl 5 modules:
Module::Starter Dist::Zilla
Finally, you may want to set up a local Perl module directory in your home directory for installing modules which you’d rather not put into your Perl’s site_perl dir (for example, more specialized, experimental, or project-specific modules). For this, use local::lib.
Local Perl docs
You can get at your perl docs using the perldoc command. They are also available online as html at http://perldoc.perl.org/.
To see the master ToC, see perldoc perltoc.
To see all the built-in functions grouped by category, see perldoc perlfunc and page down a couple of times.
You can always jump straight to the docs for a given function using the -f option, for example: perldoc -f sort. You can read the perldocs of a local file like so: perldoc ./myfile.pl (you can use the -F option here to speed things up).
See perldoc perl to get the big table of contents of all the available perldoc pages.
At the top of every script/module
Always (unless you have a good reason not to) start your scripts and modules with use strict and use warnings, or else use Modern::Perl (which does those for you, plus a bit more).
Language fundamentals
Expressions are bits of code that perl evaluates to some value. They are made up of terms and operators.
Statements tell the interpreter to do something, and are made up of expressions.
Declarations are like statements, but only tell the interpreter to learn something.
Blocks are one or more statements separated by semicolons and delimited as a whole by braces.
The $, @, and % sigils are for scalar, array, and hash expressions, respectively. Variables look like $foo, @bar, and %baz.
Strings can be ‘single-quotish’ (hard strings) or “double-quotish” (soft strings). There’s also alternate syntax: q{single-quotish} and qq{double-quotish}.
Double-quotish strings allow escapes and variable interpolation. You can interpolate with curlies in there too if necessary, "like ${this}tastic.".
You can interpolate an item of a list (@ar = qw/this am and or but/) into a string "like $ar[0] for ex$ar[1]ple".
Variables represent the value itself; they are not “references” to the values unless you explicitly make a reference (ex. my $foo = \@a).
When you do my @b = @a;, you’re making a shallow copy of @a. To make a deep copy, use Clone or Storable (see its dclone function).
Strings can be modified in-place (for example, using s///, chomp, and substr). Note, you can’t index into a string as if it were an array: it’s a $calar, not an @rray.
Use length $a_string to get the length of a string. For the length of an array, just evaluate it in scalar context. To get the length of a hash, do scalar keys %h.
$_ is the default arg in a number of places, for example:
- default item in for and “while (<>)” loops
- default arg to chomp, print, say, and others
- m// matches against it unless you use the binding operator (=~)
- likewise with s///
You use a different set of operators for working with numbers than with strings:
numerical op | string op |
---|---|
== | eq |
!= | ne |
< | lt |
<= | le |
> | gt |
>= | ge |
<=> | cmp |
* | x |
+ | . |
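As a rough illustration of the table above, the following sketch compares the same values with the numeric and the string operators. The variable names are made up for the example.

```perl
use strict;
use warnings;
use feature 'say';

# Numeric comparison treats both operands as numbers.
my $num_equal = ( 10 == 10.0 ) ? 1 : 0;        # true: same number
# String comparison treats both operands as strings.
my $str_equal = ( '10' eq '10.0' ) ? 1 : 0;    # false: different strings

# The same values sort differently depending on the comparison operator.
my @values    = ( 10, 9, 100 );
my @by_number = sort { $a <=> $b } @values;    # numeric order
my @by_string = sort { $a cmp $b } @values;    # string (character) order

# `x` repeats a string; `.` joins strings.
my $rule  = '-' x 5;
my $label = 'foo' . 'bar';

say "numeric equal: $num_equal, string equal: $str_equal";
say "by number: @by_number";
say "by string: @by_string";
say "$rule $label";
```

Picking the wrong family of operators usually does not raise an error; it just silently compares the values in the wrong way, which is why use warnings is worth having on.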
Use the dot “.” to do string concatenation. There’s no op for concatenating arrays; you just write them together: (@a1, @a2).
=> is the fat comma. It’s like a regular comma, except that it autoquotes what’s to the left of it if it’s just a simple identifier.
.. is the range operator. Works for numbers, and characters too (ex. 'a' .. 'z').
<> is the line input operator, a.k.a. the angle operator, a.k.a. the readline function.
You can put underscores in number literals, as in 1_000_000, 0x0000_1111, 0b11_00_11_00, etc.
Perl 5 has no boolean literals. If necessary, just use 1 and 0 to mean true and false.
undef, 0, 0e0 (that is, 0×10⁰) (well, any 0×10ⁿ (0e1, 0e2, …)), q{} (the empty string) and '0' (the string) are all falsey values. They evaluate to false in a boolean context. All other scalars are truthy.
Note that there’s a difference between expressions as they appear in your source code (at compile-time), and the values that the interpreter evaluates them to (at runtime). Context is determined at compile-time.
A list is something that exists at runtime, in the Perl interpreter. In english, when we see a number of things separated by commas, we tend to call that a list. However, when discussing Perl compile-time expressions, it’s more accurate to call a bunch of things separated by commas a “comma expression”.
An array is one of @these in your code. What it gets evaluated to at runtime depends on the context.
Empty arrays (my @a = ();) and empty hashes (my %h = ();) evaluate to false in a boolean context. If either has anything in it though, it’s true.
print only prints what you tell it to. It won’t put spaces between the args you pass it, and you need to include a "\n" if you want one. say tacks on a newline for you.
By default, arrays interpolate into strings with their elements separated by spaces (which makes them easy to print: say "the items are: @ar";). Hashes don’t interpolate; if you want a string representation of a hash, maybe use Data::Dumper, Data::Printer, or Dumpvalue.
Parentheses are used for all grouping, lists, and hashes. You also need to put parens around an if, while, for, etc. condition expression.
hashes:
- creating one: my %h = (foo => 2, bar => 4);
- setting a key/value pair: $h{baz} = 3;
- accessing a value: my $bar = $h{foo};
In list context, a hash unwinds into one long flat list of key/value pairs: my @a = %h;. (So, if you know all your values are unique, you can invert a hash like so: my %inverted_h = reverse %h;.)
For hashes, exists tells whether or not the key is even there. defined tells if the value (for an existing key) is defined or not:
if (exists $h{$key} ) ... # Is $key in this hash?
if (defined $h{$key} ) ... # Is its value defined?
if ( $h{$key} ) ... # Is its value true?
splice is for arrays; substr is for strings. Also note: delete is for hashes and hash slices. To remove items from arrays, use shift, pop, and splice. You may delete the current item from a hash while iterating over the hash (see perldoc -f each), but don’t try that with an array.
Perl 5 doesn’t have a “set” data type. For that, use either Set::Scalar or else use a hash (and ignore its values).
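The hash-as-set approach mentioned above might look something like this minimal sketch. The data and variable names are invented for illustration, and only core Perl is used.

```perl
use strict;
use warnings;
use feature 'say';

my @fruits = qw/apple pear apple plum pear/;

# Build the "set": the keys are the members, the values are ignored.
my %seen;
$seen{$_} = 1 for @fruits;

my @unique   = sort keys %seen;                # each member once
my $has_plum = exists $seen{plum} ? 1 : 0;     # membership test
my $has_kiwi = exists $seen{kiwi} ? 1 : 0;

delete $seen{pear};                            # remove a member
my $size = scalar keys %seen;                  # number of members left

say "unique: @unique";
say "plum? $has_plum, kiwi? $has_kiwi, size now: $size";
```

Using exists rather than a plain truth test means membership still works even if some values happen to be 0 or undef.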
Define a function like so: sub foo { ... }. Some quick notes:
Within a sub, you can just refer to (and change) globals normally.
The last expression evaluated is what gets returned; however, it’s probably better style to always use an explicit return (and use a bare return to indicate failure (thanks PBP)).
Args get passed in by reference via @_. If you want local copies of them, do: my ($foo, $bar) = @_. Although @_ is a local, its contents ($_[0], $_[1], etc.) refer to the variables in the caller’s scope: they are aliases to them.
Subroutines are package globals.
You can call built-ins as functions or as operators. If you call them as functions (with explicit parentheses), they have very high precedence. If you call them as operators (no parens) they have very low precedence.
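To make the aliasing behaviour of @_ described above more concrete, here is a small sketch. The two subroutine names are hypothetical, chosen just for this example.

```perl
use strict;
use warnings;
use feature 'say';

# Modifies the caller's variable, because $_[0] is an alias to it.
sub double_in_place {
    $_[0] *= 2;
    return;
}

# Works on a private copy, so the caller's variable is left untouched.
sub doubled {
    my ($n) = @_;
    $n *= 2;
    return $n;
}

my $x = 5;
double_in_place($x);    # changes $x itself

my $y = 5;
my $z = doubled($y);    # $y keeps its value; $z gets the result

say "x=$x y=$y z=$z";
```

Copying the args into lexicals first, as in the second sub, is the usual choice since it avoids surprising the caller.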
Take a reference by adding a backslash in front of the variable: my $foo_ref = \@my_array;.
The special syntax for a literal array ref is [...], and for a hash ref it’s {...}.
Dereferencing:
my $foo_ref = \@my_array; # Take a ref of @my_array.
my @a2 = @{ $foo_ref }; # Dereference $foo_ref.
That is, you put a reference inside ${}, @{}, or %{} to dereference it. The braces are sometimes optional, but I like to include them, for clarity.
There are shortcuts for dereferencing. Observe:
my %foods = (
    'good' => ['beets', 'spinach', 'carrots'],
    'bad' => ['twinkies', 'devil dogs'],
    'ugly' => ['gruel', 'slop'],
);
my $f = ${ $foods{good} }[1]; # (spinach) Not using any shortcuts.
my $g = $foods{bad}->[0]; # (twinkies) Using the arrow shortcut to dereference.
my $h = $foods{ugly}[1]; # (slop) Perl lets you omit the arrow here.
# Also:
my @ar = (['a', 'b', 'c'], [1, 2, 3], ['foo', 'bar']);
my $s1 = ${ $ar[0] }[1]; # Not using any shortcuts.
my $s2 = $ar[1]->[2]; # Using the arrow shortcut.
my $s3 = $ar[2][1]; # Perl lets you omit the arrow here.
Use map and grep to easily build lists from other lists (block forms are preferable).
You can stash data at the end of your file after a line that has __DATA__ on it (in a top-level script, __END__ works as well). Access that data via the DATA filehandle. If it’s binary data, base64 encode it first (see MIME::Base64).
Use die to write to stderr and exit. Use warn to write to stderr but not exit. If writing a module, use the Carp equivalents to give more info to the users of your module.
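As a quick sketch of the block forms of map and grep mentioned in this section (the lists and names here are made up for the example):

```perl
use strict;
use warnings;
use feature 'say';

my @nums = ( 1 .. 6 );

# grep keeps the items for which the block is true.
my @evens = grep { $_ % 2 == 0 } @nums;

# map transforms each item; the block's value becomes the new item.
my @squares = map { $_ * $_ } @nums;

# map can return more than one value per item, which is handy for
# building a hash from a list.
my %length_of = map { $_ => length($_) } qw/foo quux/;

say "evens: @evens";
say "squares: @squares";
say "quux has $length_of{quux} characters";
```

Inside both blocks, $_ is set to each item in turn, in the same way as in a for loop.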
Context
When perl evaluates an expression, what type of value it expects to find (scalar or list) depends upon context.
Context is determined at compile-time when perl parses your source code.
- If perl is expecting a given expression to be a scalar, it tries to evaluate it such that it provides a scalar.
- If perl is expecting a given expression to be a list, it tries to evaluate it such that it provides a list.
If a scalar is put into a list context, it usually produces a one-element list.
If a list-producing expression is put into a scalar context, it will hopefully evaluate to something useful. For example, an array will yield the number of elements it contains.
You can force a scalar context by using the scalar operator.
In the docs, when you see something like “sort LIST”, it means that the sort operator provides a list context to its arguments. Furthermore, if the operator provides a list context to an argument, it also provides a list context to the elements of that list argument.
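The following sketch shows the same expressions giving different results in scalar and list context. It uses only built-ins; the variable names are illustrative.

```perl
use strict;
use warnings;
use feature 'say';

my @items = qw/a b c/;

my $count   = @items;                # scalar context: number of elements
my ($first) = @items;                # list context: first element
my $forced  = scalar(@items);        # explicit scalar context
my $message = 'count: ' . @items;    # concatenation imposes scalar context

# localtime is a built-in whose result depends on context.
my @fields = localtime(0);           # list context: separate time fields
my $stamp  = localtime(0);           # scalar context: one readable string

say $message;
say 'localtime gave ', scalar(@fields), " fields, or the string '$stamp'";
```

The parentheses around $first are what make that assignment a list assignment; without them, $first would receive the element count instead.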
Lexicals, globals, and scoping
Perl provides 2 kinds of namespaces for variables to live in: “package” (i.e. “symbol tables”), and lexical. Package variables are globals (aka “package globals”), are dynamically scoped, and live in named symbol tables. Lexicals are locals and live in unnamed lexical scopes. File scope is the largest possible lexical scope.
When one subroutine calls another, the one being called is in the dynamic scope of the caller. When one block is inside another, the inner block is in the lexical scope of the outer one. Note, however, that when you get to the end of a block where you started, you leave the current lexical and dynamic scopes.
If you like, you can create your very own secluded lexical scope with an empty block:
say "Noisy outer scope!";
{
# Nice quiet local scope in here.
my $i = 3;
say $i;
}
# say $i; ERROR
Symbol tables are actually global hashes, and contain the names of the variables in them. All the built-in globals (like @ARGV, %ENV, $$, etc.) are located in the main symbol table.
Incidentally, within each namespace there’s a sub-namespace for each sigil (that’s why $foo and @foo are 2 different variables). If you want to refer to every “foo” in a symbol table, regardless of sigil, you use a “typeglob”. Perl uses typeglobs to implement the importing of modules.
A fully-qualified package variable name like $Foods::Veg::Tomato::variety shows you the structure of the nested symbol tables, the most deeply-nested of which contains the global $variety scalar. The fully-qualified name also indicates that the file path leading to Tomato.pm is Foods/Veg/Tomato.pm.
Note: you can’t see local/lexical/my variables in a module from outside that module.
When you call use, it often imports package symbols (or else gives the compiler some hints (as “pragmas”)) for use in the current lexical scope.
A package declaration (usually at the top of a module) is lexically scoped, and declares the name of the current default package until the end of the current lexical scope (usually the file).
Recall that use happens at compile-time. require happens at run-time.
Aside from package and use (and require), the three operators dealing with lexicals and package globals are my, our, and local:
my declares a lexically-scoped variable. Its name and value are both stored locally, only.
our declares a lexically scoped name that refers to a package global. Using the above example, the Tomato.pm file would contain our $variety; in it. our does not create values; it just gives you access to the global, though you usually give access to and create it at the same time, ex.:
our $foo = 7;
Note: “our” replaces “use vars”.
local sets up a temporary value for a package global, but only for the current dynamic scope. That is, if you use local in a sub, and then call another sub, in that 2nd sub you’re still in the same dynamic scope, and so will still see that localized value. Once the 1st sub returns, you’re back to the pre-localized value. Of course, the same thing happens if you come out of a lexical scope: you’re back to the value that the package global had before the scope you were just in.
Only use local if you have a good reason to.
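To see the difference between my, our, and local in action, consider this sketch. All the sub and variable names are hypothetical.

```perl
use strict;
use warnings;
use feature 'say';

our $level = 'global';    # the package global $main::level

# Returns whatever value the package global currently has.
sub show_level { return $level }

sub with_local {
    local $level = 'localized';    # temporary value for this dynamic scope
    return show_level();           # the called sub sees the localized value
}

sub with_my {
    my $level = 'lexical';         # a separate lexical; the global is untouched
    return show_level();           # the called sub still sees the global
}

my $during_local = with_local();
my $during_my    = with_my();
my $afterwards   = show_level();   # the localized value is gone again

say "local: $during_local, my: $during_my, afterwards: $afterwards";
```

The called sub never sees the lexical from with_my, since lexical scope follows the source text rather than the call chain.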
More on operators
A little more on operators:
They come in three flavors: unary, binary, and trinary.
The things operators work on are called “terms”.
Autoincrement and autodecrement have a little extra magic when dealing with alphanumeric strings.
The -> is a binary infix dereference operator when used like so:
$a_ref->[0]
$h_ref->{foo}
$s_ref->('bar')
Here’s an example of using it with a reference to a function:
sub foobar { ... };
my $fn_ref = \&foobar;
...
&{$fn_ref}(); # calling the function
$fn_ref->(); # same
Otherwise, the arrow is used for method calls, like:
my $f = Foo->new();
$f->bar();
Use ** for raising a number to a power.
=~ is the regex binding operator. =~ is for “match”, and its cousin !~ is for “doesn’t match”. These binding operators have a pretty high precedence.
You can get a list of n things like so: “my @a = ('whatever') x $n;”.
To get “=====”, do: “my $s = q{=} x 5”.
Among the assignment operators, note the presence of ||=, //=, .=, and x=.
In list context, the comma is just a separator. In scalar context, it’s an operator, but not one you’d normally use.
You can make a reference to a list of words like so:
[ qw/foo bar baz/ ]
For more, read perldoc perlop
.
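A few of the operators from this section, tried out in a short sketch. The names are invented, and the //= operator assumes Perl 5.10 or newer.

```perl
use strict;
use warnings;
use feature 'say';

# Repetition: list form and string form.
my @dashes = ('-') x 3;
my $rule   = '=' x 5;

# //= assigns only if the left side is undefined;
# ||= assigns if the left side is false.
my %opt = ( verbose => 0 );
$opt{verbose} //= 1;         # stays 0, because 0 is defined
$opt{colour}  //= 'auto';    # gets set, because it was undefined
my $retries = 0;
$retries ||= 7;              # gets set, because 0 is false

# .= appends and x= repeats in place.
my $msg = 'foo';
$msg .= 'bar';
$msg x= 2;

# Autoincrement "magic" on an alphanumeric string.
my $id = 'az';
$id++;

my $power = 2**10;

say "@dashes $rule $opt{verbose} $opt{colour} $retries $msg $id $power";
```

The difference between //= and ||= matters whenever 0 or the empty string is a legitimate value.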
Keywords
The details are in perldoc perlsyn. Among others, you’ll find info on:
- if, unless, elsif, else
- for, while, until
- next, last, redo
- continue (see perldoc -f continue)
Note, in Perl you can label loops, if you like.
LINE:
for my $line (@lines) {
#...
next LINE if $thus_and_so;
}
BTW, $line is lexically scoped to that for-loop block. Same thing with while loops like this:
while (my $r = get_thingy()) {
# do something
}
say "it was $r!"; # ERROR: $r is out of scope here.
Incidentally, a bare block:
{
print "hi\n";
}
is the same as a loop that only loops once. Its contents have their own lexical scope, and you can exit it using last (or even start it over using redo). if and unless blocks are, of course, not loops.
When looping over a list, the variable used for each item ($_ being the default) actually aliases the item. So, you can change the list items on the fly. However, don’t delete items on the fly; for that, append to a separate array instead.
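A small sketch of both halves of that advice: changing items through the loop variable, and collecting the items you want to keep in a separate array. The data is made up.

```perl
use strict;
use warnings;
use feature 'say';

my @names = qw/alice bob carol/;

# $name aliases each element, so assigning to it changes @names itself.
for my $name (@names) {
    $name = ucfirst $name;
}

# Rather than deleting from @names while looping over it,
# append the items to keep to a separate array.
my @kept;
for my $name (@names) {
    push @kept, $name unless $name eq 'Bob';
}

say "names: @names";
say "kept: @kept";
```

For simple filtering like the second loop, grep with a block would do the same job in one line.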
Built-ins
Perl comes with a good number of built-in functions. They’re also sometimes referred to as operators when you don’t put parentheses around their arguments. When you use a built-in that only takes one arg, and you don’t use parentheses, it’s called a named unary operator.
See a nicely-organized list of them at perldoc perlfunc
.
Standard library
Perl comes out-of-the-box with many modules in its standard library. To see the list of them, run perldoc perlmodlib
.
Commonly-used variables
Besides $_, a few of them are:
- @ARGV: args passed into this script. (Recall, the program’s name is stored in $0, not in $ARGV[0].)
- %ENV: holds environment variables.
- %INC
- @INC
- %SIG: to set up signal handlers.
- $?: see perldoc perlvar under $CHILD_ERROR
- $!
See chap. 28 of the Camel for more, or read perldoc perlvar.
Quoting
- qq{} (""), q{} ('')
- qx{} (``)
- qw{}
- m//, s///, tr///
- qr{}
See the perldoc perlop section “Quote and Quote-like Operators” for more info.
If you need a multi-line string, use the heredoc syntax:
my $long_string = <<"END_OF_STRING"; # Note explicit quoting.
la dee da
va va va $voom
ok, done
END_OF_STRING
process_long_string($long_string);
process_long_string(<<'EOS');
line one
line two
line three
EOS
For dealing with regexes
m//, s///
- Captured groups go in $1, $2, …
- There’s also $`, $&, and $' for pre-match, match, and post-match.
Remember that, inside the current regex (i.e. during the match), you use \1. Outside the match, you use $1. For s/foo/bar/, you’d use $1 (not \1) where “bar” is.
The /g
regex modifier is for globally finding matches. What the pattern match yields is described in the following sections.
It’s probably best practice to always use “/xms” at the end of your regexen.
Mostly, what you’ll be doing with regexes is:
checking if a string matches a regex:
if ( $s =~ m/.../xms ) {...}
doing a search/replace:
$s =~ s/foo/bar/xms;
(add “g” to the xms to replace all)
iterating over a bunch of strings you find in some long string:
my $long_str = <<'EOS';
line one foo11bar baz
line two foo283bar moo
line three foo321bar yo
EOS
while ( $long_str =~ m/ foo (\d+) bar /gxms ) {
    say $1;
}
Resulting value after an attempted match (m//)
In scalar context
Without /g:
- If a match, returns true (1).
- If no match, returns the empty string (false).
With /g (“progressive match”):
- If a match or no match, same as without /g. However, each subsequent request for a match moves the position pointer to just after the previous match.
In list context
Without /g:
- If a match, returns the list of matches captured by the grouping parentheses (if there’s no grouping parens, then returns (1)).
- If no match, returns the null list.
With /g:
- If a match, and no grouping parentheses, returns a list of all matches found. If there’s parens, returns the strings captured.
- If no match, same as without /g.
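The cases listed above can be checked with a sketch along these lines. The sample string and variable names are invented for illustration.

```perl
use strict;
use warnings;
use feature 'say';

my $text = 'foo11bar foo283bar';

# Scalar context, no /g: just true or false.
my $matched = ( $text =~ m/foo/xms ) ? 1 : 0;

# List context, no /g: the captured groups of the first match.
my ($first) = $text =~ m/foo (\d+) bar/xms;

# List context, with /g: the captures from every match.
my @all = $text =~ m/foo (\d+) bar/gxms;

# List context, with /g and no parens: every whole match.
my @whole = $text =~ m/\d+/gxms;

# List context, no match: the empty list.
my @none = $text =~ m/qux/xms;

# Scalar context, with /g: each attempt resumes after the previous match.
my @positions;
while ( $text =~ m/\d+/gxms ) {
    push @positions, pos($text);    # offset just after the current match
}

say "first: $first, all: @all, whole: @whole, positions: @positions";
```

The pos function exposes the position pointer that the progressive match moves along.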
Regarding s///
You can use /g with s/// as well, and it does what you’d expect. Regardless, s/// returns a number telling how many times it succeeded in doing the replacement. But note, s///g in scalar context is not progressive like m//g is; you need to manually loop for something like that.
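A brief sketch of the return value of s///, with and without /g. The strings and variable names are made up.

```perl
use strict;
use warnings;
use feature 'say';

my $once_text  = 'foo foo foo';
my $once_count = ( $once_text =~ s/foo/bar/xms );     # replaces the first only

my $all_text  = 'foo foo foo';
my $all_count = ( $all_text =~ s/foo/bar/gxms );      # replaces every match

my $miss_text  = 'foo foo foo';
my $miss_count = ( $miss_text =~ s/qux/bar/gxms );    # nothing to replace

say "once: $once_count ($once_text)";
say "all: $all_count ($all_text)";
say 'miss: ', ( $miss_count ? 'replaced' : 'no replacement' );
```

When nothing is replaced, the return value is false rather than a literal 0, so it is simplest to test it in boolean context.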
There’s a lot more to regexes, of course. For details, see chapter 5 of the Camel, and/or perldoc perlre.
Object Oriented Programming
Use Moose. Or, for something much smaller, faster to start up, and less featureful, see Moo.
Basic Moose usage: To create a WoodStove class, put the following into WoodStove.pm:
package WoodStove;
use OtherModules;
use YouMightNeed;
use Moose;
use namespace::autoclean;
# extends, roles, attributes, etc.
# methods
no Moose;
__PACKAGE__->meta->make_immutable;
1;
Files
Some functions: open, close, chdir, glob, unlink, rename, mkdir, rmdir, …
Regarding open:
use autodie qw/:all/;
open( my $in_file, '<', 'input.txt' );
open( my $out_file, '>', 'output.txt' );
my $one_line = <$in_file>; # Reads the first line.
my @all_lines = <$in_file>; # Reads all the remaining lines.
print {$out_file} @all_lines; # Note extra braces and no comma.
close $in_file;
close $out_file;
# Easiest way to read in all lines of a file:
use File::Slurp qw/slurp/;
my $one_big_string = slurp 'input.txt';
my @slurped_lines = slurp 'input.txt';
Note: you don’t need to check for success when opening files (those first 2 lines) if you’re using autodie. And you should indeed be using autodie.
You can do tests on files using their filename and the various “-x” tests such as -r (is readable), -w (is writable), -e (exists), and so on. For the full list of tests, see perldoc perlfunc (they’re listed near the top).
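Here is a minimal sketch of a few file tests. It assumes the core File::Temp module for creating a scratch file, and the variable names are illustrative.

```perl
use strict;
use warnings;
use feature 'say';
use File::Temp qw/tempfile/;

# Create a scratch file that is removed when the program exits.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
print {$fh} "hello\n";
close $fh or die "Cannot close $filename: $!";

my $exists   = ( -e $filename ) ? 1 : 0;    # does it exist?
my $readable = ( -r $filename ) ? 1 : 0;    # can we read it?
my $is_dir   = ( -d $filename ) ? 1 : 0;    # is it a directory?
my $size     = -s $filename;                # size in bytes (false if empty)
my $missing  = ( -e "$filename.nope" ) ? 1 : 0;

say "exists: $exists, readable: $readable, dir: $is_dir, size: $size";
say "the made-up sibling file exists: $missing";
```

The parentheses around each test are not strictly required, but they keep the precedence obvious when a test is combined with other operators.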
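Note that glob has some special magic: in scalar context, such as in the condition of a while loop, each call gives you the next filename (and undef once the list runs out). A for loop, by contrast, calls glob in list context and gets all the filenames at once.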
while (my $fn = glob '*.txt') {
say "Found $fn";
}
This is analogous to the magic of the line input operator.
Processes
Aka, “shelling out”.
Use system when you don’t need to capture the output of the program you’re running, and when you may need to interact with it (via stdin/stdout). The return value is the wait status of the program you shelled out to; shift it right by 8 bits ($status >> 8) to get the program’s exit code.
Use backticks for running a shell program and capturing its output (ex. my $foo = `date`;). What’s between the backticks is double-quotish.
See also the docs for IPC::System::Simple and Capture::Tiny.
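A small sketch of system and backticks follows. So that it does not depend on any particular external program, it runs perl itself via $^X (the path of the running perl); it assumes that path contains no spaces, since the backtick command goes through the shell.

```perl
use strict;
use warnings;
use feature 'say';

# system returns the wait status; the exit code is in the high byte.
my $status    = system( $^X, '-e', 'exit 3' );
my $exit_code = $status >> 8;

# Backticks capture whatever the command writes to standard output.
my $output = `$^X -e "print 42"`;

say "wait status: $status, exit code: $exit_code, captured: $output";
```

Passing system a list of arguments, as above, bypasses the shell, which avoids quoting problems with the arguments.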
Exception handling
See perldoc -f eval and perldoc -f die.
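As a rough sketch of the basic eval/die pattern those docs describe (safe_sqrt is a hypothetical function written for this example):

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical function that throws an exception on bad input.
sub safe_sqrt {
    my ($n) = @_;
    die "negative input: $n\n" if $n < 0;
    return sqrt $n;
}

# eval BLOCK traps the exception. The trailing 1 makes the block return
# true on success, so failure can be detected without relying on $@ alone.
my ( $good, $bad );
my $ok_good = eval { $good = safe_sqrt(9);  1 };
my $ok_bad  = eval { $bad  = safe_sqrt(-4); 1 };
my $error   = $ok_bad ? q{} : $@;

say "good result: $good";
print "caught: $error" unless $ok_bad;
```

Try::Tiny, listed elsewhere in this reference, wraps this pattern up and takes care of some of the corner cases around $@.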
POD
Perl 5 POD is pretty no-frills. Blank lines separate paragraphs. You indent what you want rendered verbatim. You can get I<italic>, B<bold>, and C<monospace>. Headings are made with =headn where n is 1, 2, 3, or 4. Lists are made with =over, =item * (or =item foo), and then with a =back at the end of the list. End POD with =cut. It works ok for manpage-style docs.
Perl 6 Pod (see Synopsis 26) is the newest standard for Pod.
That said, this quick reference is written in Pandoc-Markdown.
Packages, Modules, and Distributions
A distribution is the tar.gz file you can download from the CPAN (which cpanm downloads for you). These are typically named “Like-This-1.02.tar.gz”.
A module is a .pm file that you can use in your code. It may contain zero or more packages (described below).
A package is a namespace. At the top of your module you can specify the current package name which sets the name of the default package for whatever follows. Packages usually have a version number as well.
package MyPackage;
our $VERSION = 0.01;
(Think of version number i.NNNN like version i.j.k where j and k can be up to 2 digits and are padded with a zero if < 10.)
You generally use CamelCase for package and module names. You can specify more than one package per file, but it’s simpler to have just one per file. That is, one .pm file == one module == one package. Keep it simple.
You name your module the same as the tail end of the package name, but with “.pm” at the end. Further, if the package name contains ::, you place the file in the corresponding nested directory. So, for package Foo::Bar::Baz;, you’d have ~/perl5lib/Foo/Bar/Baz.pm. For your own simple modules, there’ll usually be no colons in the package name, and you’ll just drop the files directly into your ~/perl5lib (no subdirs needed or required).
You’ll need to have use lib '/home/you/perl5lib' in your source file for it to find your own modules.
Checking if you have a module installed
# If you can `use` it, it's installed.
perl -MSome::Module -e 1
# If you can read its docs, it's installed.
perldoc Some::Module
List all core modules installed
To see the full list of core modules for a given version of Perl:
corelist -v 5.n.m
(See corelist -v for the full list of versions that corelist knows about.)
List all extra modules you’ve installed
perldoc perllocal
Checking if a given module comes standard with Perl
Use the corelist script that comes with Perl, ex.:
corelist List::Util
Find out where a given module is installed
perldoc -l Some::Module
More Module Management Tools
See pmtools.
Uninstalling modules
Tricky.
You might opt to always install modules into your own ~/perl5lib (say, via local::lib). This way, you keep your Perl’s site_perl dir relatively clean, and you can tinker around with your ~/perl5lib dir as you wish. Then, if you really botch up your ~/perl5lib, you can always wipe it and start over without harming your Perl installation.
Using modules
To use modules, whether they come with the standard library or from elsewhere, you just use them near the top of your file like so:
use Foo::Bar; # Import whatever Foo::Bar exports.
use Foo::Baz qw/func1 func2/; # Only import func1 and func2.
use Foo::Moo (); # Do not import anything.
You’ll sometimes see something like:
use Foo::Bar -moo;
The two things going on there are:
- Putting a dash in front of a bareword does a little magic and makes it into a string which starts with a dash.
- Putting a lone string where a list is expected evaluates to a one-item list.
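To see what a module actually receives in that case, here is a sketch using a hypothetical inline package, Local::Demo, whose import method just records its arguments. Calling import directly stands in for what use would do at compile-time.

```perl
use strict;
use warnings;
use feature 'say';

# Local::Demo is a made-up package for this example only.
{
    package Local::Demo;
    my @received;

    # Called by `use Local::Demo LIST;` with the class name and the list.
    sub import {
        my ( $class, @args ) = @_;
        @received = @args;
        return;
    }

    sub received { return @received }
}

# Roughly what `use Local::Demo -moo;` does behind the scenes:
Local::Demo->import( -moo );

my @got = Local::Demo::received();
say "import received: @got";
```

What the module then does with a string like that is entirely up to its import method; the leading dash is only a convention.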
Where does Perl search for modules?
Paths are stored in @INC. See them yourself: perl -MModern::Perl -e 'for (@INC) { say; }'
Installing modules in a local directory
You can use cpanm to install modules into your own local ~/perl5lib directory (just like how it normally installs into the site_perl dir) by first installing and setting up local::lib. Follow the “bootstrap method” to install local::lib into your own ~/perl5lib dir.
TODO: more instructions here.
Writing your own procedural modules
For simple modules, just put this into your /home/<me>/perl5lib/Foo.pm file:
package Foo;
our $VERSION = 1.01;
my $whatever; # Only visible inside this module.
our $bar; # Can be accessed outside via `$Foo::bar`.
# Access outside via `Foo::baz()`.
sub baz {
#...
}
1;
No need to use Exporter if it’s just a simple module that you don’t wish to export anything from.
From other scripts, to access Foo.pm’s globals and functions:
use lib '/home/me/perl5lib';
use Foo;
say $Foo::bar;
Foo::baz();
For anything more complex, or for modules you wish to distribute, see Module::Starter and Dist::Zilla.
Some Perl idioms
Easily generate an arrayref from a list of words:
[qw/foo bar baz/]
Assigning to multiple variables at once:
my ($foo, $bar, $baz, $moo) = ('XX', 38, [qw/a b c/], 'yo');
When you need to do a number of search-replace operations on a given string:
for ( $st ) { s/foo/bar/; s/baz/moo/; s/qux/quux/; }
Randomly picking one item from a list:
my @words = qw/foo bar baz/; my $word = $words[ rand(@words) ];
Looping over a hash:
while ( my ($k, $v) = each %h ) {
    # ...
}
Some common tasks
Files, line-by-line
To do something with a file, line-by-line:
use autodie qw/:all/;
open(my $fh, '<', 'foo.txt');
while (my $line = <$fh>) {
chomp $line;
# ...
}
close $fh;
or
use File::Slurp qw/slurp/;
my $one_big_line = slurp 'foo.txt';
my @lines = slurp 'bar.txt';
chomp @lines;
Shell output line-by-line
for my $line (`ls -l *.txt`) {
chomp $line;
say ">>>$line<<<";
}
Date and time
my $sse = time; # Seconds since epoch.
my ($hours, $minutes, $seconds) = (localtime)[2,1,0];
my ($year, $month, $day) = (localtime)[5,4,3];
$month++; # Necessary.
$year += 1900; # Necessary.
printf "%u-%02u-%02u\n", $year, $month, $day;
# See `perldoc -f sprintf` for details on that format string.
my $nice_string = localtime;
# You can also pass `localtime()` an sse value.
my $earlier = localtime($sse - 120); # 2 minutes ago.
For anything more complicated than that, use DateTime.
Random
rand() # 0.0 to < 1.0
rand(10) # 0.0 to < 10.0
int(rand(10)) # 0 to 9 (`int` truncates)
Database access
SQLite
Make sure you have SQLite installed:
sudo apt-get install sqlite3 libsqlite3-dev
Then:
cpanm DBI
cpanm DBD::SQLite
XML
Use XML::Twig. See also XML::LibXML.
Other tips and best practices
One good place to look for a list of Perl best practices is the book “Perl Best Practices”, by Damian Conway. If you haven’t already read it, you probably want to read it. Here are some tips (some of which are from PBP):
Always use strict and use warnings.
Use my for variables you want to be local, our for ones you want to be global.
Always use parentheses when calling non-built-in functions.
Only use the unless statement modifier when the part that comes first is the usual case, and the modifier is for an “of course, in the unlikely event” situation. For example:
go_to_store() unless $hurricane_outside;
You can also use modifiers with the various loop operators:
while (<>) {
    next if m/$bad_coffee_cake/;
    last if m/$tomato_too_soft/;
    #...
}
Regex tips:
- Possibly except for very simple matches, always use /xms. Then \A and \z are beginning and end of string.
- Prefer m//xms to //xms.
- Use non-capturing parentheses “(?:...)” when you want grouping but no capturing.
- Consider using canned regexen via Regexp::Common sometimes.
Don’t use constant;. use Readonly; instead. You’ll need to grab it from CPAN.
Always quote your heredoc marker after the <<.
Use the fat comma for pairing.
If you need to change the value of a punctuation variable, always localize it first.
use English for the less-familiar punctuation variables.
When you really do need to know indexes of values in a list (note that each on an array needs Perl 5.12 or newer):
my @ar = qw/foo bar baz moo poo qux/;
while (my ($idx, $item) = each @ar) {
    say "$idx: $item";
}
or, the more old-fashioned way:
my @ar = qw/foo bar baz moo poo qux/;
for my $idx ( 0 .. $#ar ) {
    say "$idx: $ar[$idx]"
}
A named lexical iterator variable in a while loop looks like:
while ( my $line = <> ) { next if $line =~ m/^#/; }
Label your loops if you’re using next, last, or redo in them.
Always use a block with map and grep.
Call your own functions with parentheses and without &. In fact, you probably shouldn’t ever call a function using the & except for when doing it by reference (as in &{$foo}($arg1, $arg2)), and in that case, consider using the arrow instead.
When writing a function that takes more than 3 args, use a hashref to pass them in with names. As in:
sub foo {
    my ($arg_ref) = @_;
    # $arg_ref->{bar}
    # $arg_ref->{baz}
    # $arg_ref->{moo}
    # $arg_ref->{qux}
    #...
}
foo( {
    bar => 'the bar',
    baz => 'shirley temple',
    moo => 'love boat',
    qux => 'Isaac',
});
Check for hash key presence like so:
my $ans = exists $q_for{ans} ? $q_for{ans} : 42;
To read in the contents of a whole file as one long string, use File::Slurp qw/slurp/. The old-fashioned way was to do my $file_contents = do { local $/; <$infile> };.
Modules to make use of
Aside from many useful built-in modules, including:
- Cwd
- autodie (see note below re. IPC::System::Simple)
- List::Util
- Data::Dumper
- File::Basename
- File::Copy
- File::Path
- File::Temp
- File::Find
- File::stat
- Getopt::Long
- Test::More
here’s a few (in no particular order):
- those listed in Task::Kensho
- Scalar::Util, List::Util, and List::MoreUtils (but not Hash::Util) (PBP chp. 8, p. 170)
- Moose for OO development. Also consider MooseX::Declare.
- You can use IO::Interactive’s is_interactive() function, or just have IO::Prompt take care of everything for you.
- Carp and Carp::Always (when writing your own modules)
- Getopt::Long for command-line option processing
- Module::Starter, though check out Module::Starter::PBP. Wait: see also Dist::Zilla
- Module::Build (used in your Build.PL file).
- Config::Std or Config::Tiny.
- Test::Simple or Test::More (or Test::Most).
- cpanm for installing modules (possibly use with local::lib)
- Modern::Perl
- DBI, DBD::SQLite, DBD::mysql, DBIx::Class (aka “DBIC”)
- DateTime
- File::Slurp
- IPC::System::Simple (for use with autodie)
- Capture::Tiny
- Try::Tiny (or maybe TryCatch)
- Path::Class (instead of File::Spec)
- Config::Tiny
- Term::ANSIColor, because who doesn’t like colors in their terminal? :)
- Regexp::Common
- Perl::Tidy (perltidy)
- Perl::Critic (perlcritic)
- Devel::NYTProf
- Devel::Cover
- Template
- Text::Autoformat
- Email::Sender::Simple
- GD, GD::Graph
- Image::Magick
- Gtk2
- Plack, PSGI, Starman
- Mojolicious, Dancer, Catalyst
- Net::OpenSSH
- WWW::Mechanize
- App::Ack
For more ideas, check out
- http://www.perlfoundation.org/perl5/index.cgi?recommended_cpan_modules
- http://www.perlfoundation.org/perl5/index.cgi?recommended_modules_for_programming_best_practices
- https://www.socialtext.net/perl5/pbp_module_recommendation_commentary
- https://www.socialtext.net/perl5/list_of_perl_modules_comparison_articles
- http://cpanratings.perl.org/