Perl Cheat Sheet
Perl is a highly capable, feature-rich programming language with over 25 years of development. These notes follow the path of the Learning Perl book.
- First line should be #!/usr/bin/perl or #!/usr/bin/env perl
- Perl compiles to byte-code before executing
- Do not use an extension for scripts.
- To specify a minimum version use “use 5.010;”
- Semicolons (;) required to separate statements
- “perl -w” turns on warnings for entire program. For just the current file use “use warnings;”
- “use diagnostics;” adds explanations to warnings
- “use strict;” requires variable declaration
- Comments start with # (no block comments)
- Parentheses are optional unless part of syntax.
- @ARGV contains arguments, $0 contains program name.
- die/warn can be used to exit/warn, $! will contain any system error message. Without \n at the end, perl will append line number to error message.
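The basics above can be sketched as a small script. This is only an illustration: the helper name describe_args and the messages are made up for the example, and eval is used here solely so that the die does not end the script.

```perl
#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: summarizes a list of command-line arguments.
sub describe_args {
    my @args = @_;
    return @args ? "args: @args" : "no args";
}

say "program: $0";
say describe_args(@ARGV);

# warn prints to STDERR and continues; with no trailing \n,
# perl appends "at <file> line <N>." to the message.
warn "this is only a warning";

# die raises an exception; eval {} traps it here so the sketch keeps running.
my $ok = eval { die "fatal problem\n"; 1 };
say "caught: $@" unless $ok;
```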
Numbers
- All numbers are stored in the same format, however the literal is written, e.g. -6.5e24, 037 (octal), 0x2a13b (hex), 0b11010 (binary)
- Underscores for readability are ignored e.g. 0x1377_0B77
- Arithmetic operators are * / + - % (modulus) ** (exponentiation)
- Comparison operators are < <= == >= > !=
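A short sketch of the numeric literals and operators listed above. The variable names are chosen only for illustration.

```perl
use strict;
use warnings;

my $octal  = 037;          # octal literal, decimal 31
my $hex    = 0x2a;         # hex literal, decimal 42
my $binary = 0b11010;      # binary literal, decimal 26
my $big    = 1_000_000;    # underscores are ignored

print $octal + $hex + $binary, "\n";   # 99
print 10 % 3, "\n";                    # 1
print 2 ** 10, "\n";                   # 1024
print 7 / 2, "\n";                     # 3.5 (division is not truncated)
```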
Strings
- To use unicode in source code - “use utf8;”
- Single-quoted literals '⅚∞☃☠' can contain any character except single-quote (use \') and backslash (use \\).
- Double-quoted strings can contain control characters written as backslash escapes, and variables are interpolated, e.g. "${var}s $var\n".
- . is for concatenation, x for string repetition.
- To insert a character by code - chr(0x05d0). To convert a character to its code - ord('א')
- String comparison operators are lt, le, eq, ge, gt, and ne
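The string operators above, as a minimal sketch. The sample values are arbitrary; the last two lines show why string and numeric comparison operators should not be mixed up.

```perl
use strict;
use warnings;

my $var    = 'rock';
my $plural = "${var}s";          # braces delimit the variable name: "rocks"
my $joined = 'abc' . 'def';      # "abcdef"
my $rule   = '-' x 5;            # "-----"
my $code   = ord('A');           # 65
my $char   = chr(0x41);          # "A"

# String operators compare character by character; numeric ones compare values.
my $as_strings = ('10' lt '9') ? 'yes' : 'no';   # "yes": "1" sorts before "9"
my $as_numbers = (10 < 9)      ? 'yes' : 'no';   # "no"

print "$plural $joined $rule $code $char $as_strings $as_numbers\n";
```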
Scalar Variables
- Scalars are numbers or strings (or undefined or references), used interchangeably.
- $calar variables start with $ and a letter or underscore, but can contain letters (including unicode), numbers or underscores.
- Scalar assignment looks like $a = 3; or $a .= "suffix"; or $a **= 3;
- Recommended to use all lowercase variable names with underscores for word separation e.g. $var_name
- print takes a single scalar or a list of scalars (separated by commas)
- When a scalar is interpreted as a boolean, the following are false: 0, '0', '', and undef.
- !! is a handy shortcut to normalize any scalar to a boolean: 1 for true, or the empty string (which acts as 0 in numeric context) for false.
- Variables have value undef before being assigned. undef acts as a 0 or "" as needed, but will throw a warning if printed. defined() checks if a variable has been defined. Can also set a variable to undef.
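A small sketch of truthiness and undef as described above. The test values and variable names are picked for illustration only.

```perl
use strict;
use warnings;

# The false values: 0, '0', '', and undef. Note that '0.0' and '00' are true.
my @results = map { $_ ? 'true' : 'false' } (0, '0', '', undef, '0.0', '00', ' ');
print "@results\n";   # false false false false true true true

my $flag = !!'hello';      # normalizes a true value to 1
my $name;                  # declared but unassigned, so undef
my $status = defined($name) ? 'defined' : 'undef';
$name = 'fred';
undef $name;               # back to undef
```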
Lists and Arrays
- A list is an ordered set of elements (each of which is a scalar value), starting with position 0.
- @rrays are variables that store a list (separate namespace from scalar variables). @x refers to entire array x.
- Undefined arrays start out as (), the empty list, and not undef.
- To access an element of an array - $myarr[0] - undef if never set.
- Index of last element is in $#arr
- Negative array indices wrap around (only once) so -1 refers to last element.
- Literal lists - ('abc','def') or (1..6) (integers 1 to 6) or qw(abc def) or qw#abc def# (quoted by whitespace).
- Can assign list values to variables e.g.
    ($fred, $barney, $dino) = ("flintstone", "rubble", undef);
    ($fred, $barney) = ($barney, $fred);    # swap those values
- push adds element(s) to end of array. pop removes a single element and returns it.
- unshift adds element(s) to start of array. shift removes a single element and returns it.
- splice removes elements from middle of array, returns them and optionally replaces them: splice @arr, start, len, @newelems
- sort and reverse functions return a new list and leave the array itself unchanged (assign the result back to the original array variable to save it).
- Expressions are parsed in either a scalar context or a list context. Scalars are promoted to single-element lists in list context.
- In scalar context, list-producing expressions may return different things - an array variable returns its number of elements. The scalar function forces a scalar context, e.g. inside print, which otherwise supplies list context.
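The array operations above can be traced step by step. This is a sketch with arbitrary sample data; the comments show the array contents after each statement.

```perl
use strict;
use warnings;

my @arr = qw(a b c);
push @arr, 'd', 'e';            # a b c d e
my $last  = pop @arr;           # 'e'; @arr is a b c d
unshift @arr, 'z';              # z a b c d
my $first = shift @arr;         # 'z'; @arr is a b c d

# Remove 2 elements starting at index 1 and put X, Y, Z in their place.
my @removed = splice @arr, 1, 2, qw(X Y Z);   # removed: b c; @arr: a X Y Z d

my @sorted   = sort @arr;       # string order: X Y Z a d; @arr is unchanged
my @reversed = reverse @arr;    # d Z Y X a
my $count    = @arr;            # scalar context gives the element count, 5
my $last_idx = $#arr;           # 4

print "There are ", scalar(@arr), " elements; the last is $arr[-1]\n";
```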
Hashes
- A hash is a list indexed by a string (key) - a collection of key-value pairs. %hash has its own namespace.
- Uses scalable, efficient algorithms. Used to be called associative arrays.
- $hash{$some_key} to access a value (curly-braces instead of square braces).
- Assigning a hash to an array unwinds (flattens) it.
- To initialize a hash:
    %some_hash = ('a', 1, 'b', 2, 'c', 3);
    %some_hash = ('a' => 1, 'b' => 2, 'c' => 3,);
- When using a big arrow (a fat-comma) or when accessing a value, simple keys don't have to be quoted (barewords) e.g.
    %some_hash = (a => 1);
    $some_hash{a};
- %revhash = reverse %hash to reverse a hash (for non-unique values, last one wins).
- keys %hash; and values %hash; return a list of keys or values in same order (or # of keys/values in scalar context).
- %hash is true only if hash has at least one key-value pair.
- %ENV hash holds environment variables.
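The hash operations above, as a minimal sketch. The %age data is hypothetical and has unique values, so reversing it is unambiguous.

```perl
use strict;
use warnings;

my %age = (fred => 30, barney => 28, dino => 5);   # hypothetical data
$age{wilma} = 29;                                   # add a pair

# keys and values come back in matching but otherwise unspecified order,
# so sort the keys when output order matters.
for my $name (sort keys %age) {
    print "$name is $age{$name}\n";
}

my $pairs   = keys %age;        # scalar context: number of keys, 4
my %by_age  = reverse %age;     # values become keys
my @flat    = %age;             # flattened into (key, value, key, value, ...)
my $has_any = %age ? 'non-empty' : 'empty';

print "$by_age{28}\n";          # barney
print "PATH is ", (exists $ENV{PATH} ? 'set' : 'not set'), "\n";
```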
Control Structures
- if-elsif-else
    if ($a == $b) {
        $a = 0;
    } elsif ($a > $b) {
        $a = 1;
    } else {
        $a = -1;
    }
- while loop
    $count = 0;
    while ($count < 10) {
        $count += 2;
    }
- foreach loop
    foreach my $rock (@rocks) {
        # modifications to $rock modify the list element
        # $_ is used if loop variable is omitted
    }
Input/Output
- Read a line of input - $line = <STDIN>;
- chomp removes a newline e.g. chomp($line = <STDIN>);
- Assigning <STDIN> to a list reads all input up till EOF e.g.
chomp(@lines = <STDIN>);
- Loop over input (the line is assigned to $_ automatically only when the read is the sole condition of a while loop)
    while (<STDIN>) {
        print "I saw $_";
    }
- foreach can also be used but uses more memory since it reads all input into memory first.
- Filehandles can be barewords (upper-cased) or variables. Special filehandles are: STDIN, STDOUT, STDERR, DATA, ARGV, and ARGVOUT.
    open CONFIG, '<dino';    # < is optional
    open BEDROCK, '>fred' or die "Cannot open fred: $!";    # "or", because || would bind to '>fred'
    open LOG, '>>:encoding(UTF-8)', 'logfile';    # for perl >= 5.6
    open my $bedrock, '>:crlf', $file_name;    # DOS-formatted output
    binmode STDOUT, ':encoding(UTF-8)';
- select can be used to change default output filehandle.
- $| = 1; unbuffers currently selected output.
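A sketch of writing and then reading a file with lexical filehandles and the three-argument form of open. It assumes a throwaway temporary file created with the core File::Temp module in place of a real data file; the names written to it are sample data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a throwaway temp file stands in for a real data file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Three-argument open with a lexical filehandle; "or" (not "||") so that
# die runs when open itself fails.
open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} "fred\n", "barney\n";
close $out or die "Cannot close $path: $!";

open my $in, '<', $path or die "Cannot read $path: $!";
my @names;
while (my $line = <$in>) {   # reads one line per iteration
    chomp $line;             # strip the trailing newline
    push @names, $line;
}
close $in;

$| = 1;                      # unbuffer the currently selected handle (STDOUT)
print "Read: @names\n";      # Read: fred barney
```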
User Subroutines
- Subroutines are in their own namespace and are typically called using &mysub (sometimes the ampersand can be omitted).
- Subroutines can be defined anywhere e.g.
    sub mysub {
        my($m, $n) = @_;
        $m + $n;
    }
- The value of the last expression evaluated is returned. But return $a; can be used to return immediately.
- A list can be returned if the subroutine is called in a list context (wantarray can be used to detect list or scalar context).
- Arguments are in the list @_ . If &mysub; is used, the caller's argument list (@_) is inherited.
- Lexically scoped variables can be used in any block by prefixing with my.
- In Perl 5.10 and up state $x; can be used to declare a private variable that keeps state between calls.
- Ampersand is optional if subroutine has previously been declared or if parentheses are used. Ampersand is not optional if built-in subroutine is being overridden.
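The subroutine features above, sketched together. The subroutine names add, context, and next_id are invented for this example.

```perl
use 5.010;
use strict;
use warnings;

# Arguments arrive in @_; the last evaluated expression is the return value.
sub add {
    my ($m, $n) = @_;
    $m + $n;
}

# wantarray is true in list context and false (but defined) in scalar context.
sub context {
    return wantarray ? 'list' : 'scalar';
}

# state (Perl 5.10+) initializes once and keeps its value between calls.
sub next_id {
    state $id = 0;
    return ++$id;
}

my $sum       = add(2, 3);      # 5
my ($in_list) = context();      # 'list'
my $in_scalar = context();      # 'scalar'
next_id() for 1 .. 2;
my $third = next_id();          # 3

say "$sum $in_list $in_scalar $third";
```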