====== Perl Cheat Sheet ======
[[http://www.perl.org/ | Perl]] is a highly capable, feature-rich programming language with over 25 years of development. These notes follow the path of the [[http://www.amazon.com/Learning-Perl-Randal-L-Schwartz/dp/1449303587 | Learning Perl]] book.
* First line should be #!/usr/bin/perl or #!/usr/bin/env perl
* Perl compiles to byte-code before executing
* Do not use an extension for scripts.
* To specify a minimum version use "use 5.010;"
* Semicolons (;) required to separate statements
* "perl -w" turns on warnings for entire program. For just the current file use "use warnings;"
* "use diagnostics;" adds explanations to warnings
* "use strict;" requires variable declaration
* Comments start with # (no block comments)
* Parentheses are optional unless part of the syntax.
* ''@ARGV'' contains arguments, $0 contains program name.
* ''die'' exits with an error message and ''warn'' prints a warning and continues; ''$!'' contains any system error message. If the message does not end with \n, perl appends the script name and line number to it.
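The points above can be pulled together into a minimal script skeleton. This is only a sketch: the messages are made up, and ''eval'' (not covered in these notes) is used here purely so that ''die'' can be demonstrated without ending the program.

```perl
#!/usr/bin/env perl
use 5.010;       # minimum perl version
use strict;      # variables must be declared
use warnings;    # warnings for this file only

# $0 holds the program name, @ARGV the command-line arguments.
my $arg_count = scalar @ARGV;
print "Running $0 with $arg_count argument(s)\n";

# warn prints to STDERR and carries on. No trailing newline here,
# so perl appends the script name and line number.
warn "No arguments given" unless @ARGV;

# die would normally end the program; eval traps it for this demonstration.
# The trailing \n suppresses the script name and line number.
my $ok    = eval { die "Something went wrong\n"; 1 };
my $error = $@;
print "Caught: $error" unless $ok;
```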
===== Numbers =====
* All numbers are stored in the same internal format, e.g. -6.5e24, 037 (octal), 0x2a13b (hex), 0b11010 (binary)
* Underscores for readability are ignored e.g. 0x1377_0B77
* Arithmetic operators are * / + - % (modulus) %%**%% (exponentiation)
* Comparison operators are < <= == >= > !=
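A short sketch of the literal forms and operators listed above; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $octal  = 037;           # leading 0 means octal: 31
my $hex    = 0x2a;          # 42
my $binary = 0b11010;       # 26
my $big    = 61_298_040;    # underscores are ignored: 61298040

my $remainder = 10 % 3;     # modulus: 1
my $power     = 2 ** 10;    # exponentiation: 1024
my $quotient  = 10 / 4;     # division is not truncated: 2.5

my $is_equal = ($hex == 42) ? 'yes' : 'no';

print "$octal $hex $binary $big $remainder $power $quotient $is_equal\n";
```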
===== Strings =====
* To use unicode in source code - "use utf8;"
* Single-quoted literals '⅚∞☃☠' can contain any character except single-quote (use \') and backslash (use \\).
* Double-quoted strings can contain control characters starting with a backslash and are variable interpolated e.g. "${var}s $var\n".
* . is for concatenation, x for string repetition.
* To insert a character by code - chr(0x05d0). To convert a character to its code - ord('א')
* String comparison operators are lt, le, eq, ge, gt, and ne
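The string operations above, as a small sketch. The values are arbitrary; the last two lines illustrate why the string comparison operators are separate from the numeric ones.

```perl
use strict;
use warnings;

my $var      = 'rock';
my $single   = 'no $var here\n';     # single quotes: no interpolation, no escapes
my $plural   = "${var}s";            # braces delimit the variable name: "rocks"
my $joined   = 'abc' . 'def';        # concatenation: "abcdef"
my $repeated = 'ab' x 3;             # repetition: "ababab"

my $char = chr(0x05d0);              # character from its code point
my $code = ord($char);               # and back again: 0x05d0

# "10" sorts before "9" as a string, but 10 is greater than 9 as a number.
my $string_less  = ('10' lt '9') ? 1 : 0;   # 1
my $numeric_less = (10 < 9)      ? 1 : 0;   # 0
```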
===== Scalar Variables =====
* Scalars are numbers or strings (or undefined or references), used interchangeably.
* $calar variable names start with $ followed by a letter or underscore, and can then contain letters (including unicode), digits or underscores.
* Scalar assignment looks like $a = 3; or $a .= "suffix"; or $a %%**%%= 3;
* Recommended to use all lowercase variable names with underscores for word separation e.g. $var_name
* print takes a single scalar or a list of scalars (separated by commas)
* When a scalar is interpreted as a boolean, the following are //false// : 0, '0', %%''%%, undefined .
* !! is a handy shortcut to convert any scalar to a canonical boolean: 1 for true, the empty string (which is numerically 0) for false.
* Variables have the value ''undef'' before being assigned. ''undef'' acts as 0 or "" as needed, but triggers a warning when used (e.g. printed) with warnings enabled.
* ''defined()'' checks whether a variable holds a defined value. A variable can be reset by assigning ''undef'' to it.
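A sketch of the truthiness and ''undef'' rules above. The sample values are chosen for illustration; note that strings such as '0.0' and '00' are true, because only the exact string '0' is false.

```perl
use strict;
use warnings;

# The false values listed above, and a few values that are true.
my @falsy  = (0, '0', '', undef);
my @truthy = (1, '0.0', '00', ' ', 'a');

my @falsy_flags  = map { $_ ? 1 : 0 } @falsy;    # 0 0 0 0
my @truthy_flags = map { $_ ? 1 : 0 } @truthy;   # 1 1 1 1 1

# !! yields 1 for true and the empty string for false.
my $bool_true  = !!'hello';
my $bool_false = !!0;

# Compound assignment operators.
my $n = 3;
$n **= 3;            # 27
my $s = 'pre';
$s .= 'fix';         # "prefix"

# A declared but unassigned variable is undef.
my $unset;
my $before = defined($unset) ? 'defined' : 'undef';
$unset = 'value';
my $after  = defined($unset) ? 'defined' : 'undef';
$unset = undef;      # back to undef
```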
===== Lists and Arrays =====
* A list is an ordered set of elements (each of which is a scalar value), starting with position 0.
* @rrays are variables that store a list (separate namespace from scalar variables). @x refers to entire array x.
* Unassigned arrays start out as ''()'', the empty list, and not ''undef''.
* To access an element of an array - ''$myarr[0]'' - ''undef'' if never set.
* Index of last element is in ''$#arr''
* Negative array indices wrap around (only once) so -1 refers to last element.
* Literal lists - ''('abc','def')'' or ''(1..6)'' (integers 1 to 6) or ''qw(abc def)'' or ''qw#abc def#'' (quoted words, split on whitespace; any delimiter can be used).
* Can assign list values to variables e.g.
($fred, $barney, $dino) = ("flintstone", "rubble", undef);
($fred, $barney) = ($barney, $fred); # swap those values
* ''push'' adds element(s) to **end** of array. ''pop'' removes a single element and returns it.
* ''unshift'' adds element(s) to **start** of array. ''shift'' removes a single element and returns it.
* ''splice'' removes elements from the middle of an array, returns them and optionally replaces them: ''splice @arr, start, len, @newelems''
* ''sort'' and ''reverse'' return a new list and leave their argument unchanged (the result can be assigned back to the original array variable).
* Expressions are evaluated in either a scalar context or a list context. Scalars are promoted to single-element lists in list context.
* List-producing expressions may return different values in scalar context; an array variable returns its number of elements. The ''scalar'' function forces a scalar context e.g. ''print scalar(@arr);'' since ''print'' supplies a list context.
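The array operations above, traced step by step in a sketch. The array contents are arbitrary, and the comments show the state after each call.

```perl
use strict;
use warnings;

my @arr = qw(a b c d e);

push @arr, 'f', 'g';            # a b c d e f g
my $popped = pop @arr;          # 'g'  -> a b c d e f
unshift @arr, 'z';              # z a b c d e f
my $shifted = shift @arr;       # 'z'  -> a b c d e f

# Remove 2 elements starting at index 1 and put 3 new ones in their place.
my @removed = splice @arr, 1, 2, qw(X Y Z);   # removes b c -> a X Y Z d e f

my $last_index = $#arr;         # 6
my $last       = $arr[-1];      # 'f'

# sort and reverse return new lists; @arr itself is untouched.
my @sorted   = sort @arr;       # X Y Z a d e f (uppercase sorts first)
my @reversed = reverse @arr;    # f e d Z Y X a

# An array in scalar context gives its element count.
my $count = @arr;               # 7
print "Count: ", scalar(@arr), "\n";
```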
===== Hashes =====
* A hash is a list indexed by a string (key) - a collection of key-value pairs. ''%hash'' has its own namespace.
* Uses scalable, efficient algorithms. Used to be called associative arrays.
* ''$hash{$some_key}'' to access a value (curly braces instead of square brackets).
* Assigning a hash to an array unwinds (flattens) it.
* To initialize a hash:
%some_hash = ('a', 1, 'b', 2, 'c', 3);
%some_hash = ( 'a' => 1, 'b' => 2, 'c' => 3, );
* When using a big arrow (a fat comma) or when accessing a value, simple keys don't have to be quoted (barewords) e.g. ''%some_hash = ( a => 1 ); $some_hash{a};''
* ''%revhash = reverse %hash'' to reverse a hash (for non-unique values, last one wins).
* ''keys %hash;'' and ''values %hash;'' return a list of keys or values, in the same order as each other (or the number of keys/values in scalar context).
* In a boolean context, ''%hash'' is true only if the hash has at least one key-value pair.
* To iterate over a hash:
while ( ($key, $value) = each %hash ) {
print "$key => $value\n"; }
* Or in order of keys:
foreach $key (sort keys %hash) {
print "$key => $hash{$key}\n"; }
* %ENV hash holds environment variables.
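A sketch of the hash operations above, using a small made-up hash.

```perl
use strict;
use warnings;

my %hash = ( a => 1, b => 2, c => 3 );

my $value = $hash{b};                 # 2, bareword key needs no quotes

# Assigning a hash to an array flattens it into key, value, key, value ...
my @flat = %hash;                     # 6 elements, pair order not guaranteed

# reverse swaps keys and values (values here are unique).
my %revhash = reverse %hash;          # ( 1 => 'a', 2 => 'b', 3 => 'c' )

# keys in scalar context gives the number of pairs.
my $pairs = keys %hash;               # 3

# Iterate in key order.
my @lines;
foreach my $key (sort keys %hash) {
    push @lines, "$key => $hash{$key}";
}

# A hash is true in boolean context only when it is non-empty.
my %empty;
my $empty_is_true = %empty ? 1 : 0;   # 0
my $hash_is_true  = %hash  ? 1 : 0;   # 1

print "$_\n" for @lines;
```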
===== Control Structures =====
* if-elsif-else:
if ($a == $b) {
$a = 0;
} elsif ($a > $b) {
$a = 1;
} else {
$a = -1;
}
* while loop:
$count = 0;
while ($count < 10) {
$count += 2;
}
* foreach loop:
foreach my $rock (@rocks) {
# modifications to $rock modify the list element
# $_ is used if loop variable is omitted
}
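A runnable sketch of the three structures above. The helper ''compare_numbers'' and the sample data are hypothetical; the ''foreach'' part shows that the loop variable is an alias for the array element, so changing it changes the array.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 0, 1 or -1 using if-elsif-else.
sub compare_numbers {
    my ($x, $y) = @_;
    my $result;
    if ($x == $y) {
        $result = 0;
    } elsif ($x > $y) {
        $result = 1;
    } else {
        $result = -1;
    }
    return $result;
}

my @results = ( compare_numbers(2, 2), compare_numbers(5, 3), compare_numbers(1, 4) );

# while loop: counts 0, 2, 4, ... and stops once $count reaches 10.
my $count = 0;
while ($count < 10) {
    $count += 2;
}

# foreach aliases each element, so the array itself is modified.
my @nums = (1, 2, 3);
foreach my $n (@nums) {
    $n *= 2;                 # @nums becomes 2 4 6
}

# Without a loop variable, the current element is in $_.
my $sum = 0;
foreach (@nums) {
    $sum += $_;              # 2 + 4 + 6
}

print "@results | $count | @nums | $sum\n";
```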
===== Input/Output =====
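* Read a line of input - ''$line = <STDIN>;''
* ''chomp'' removes a trailing newline e.g. ''chomp($line = <STDIN>);''
* Assigning to a list reads all input up to EOF e.g. ''chomp(@lines = <STDIN>);''
* Loop over input (the line is assigned to ''$_'' automatically only when the line-input operator is the sole expression in the ''while'' condition):
while (<STDIN>) {
print "I saw $_";
}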
* ''foreach'' can also be used but uses more memory since it reads all input into memory first.
* ''<>'' iterates over the lines of all files named in @ARGV (or STDIN if there are no arguments), like cat/sed/awk:
while (<>) {
chomp;
print LOGFILE "It was $_ that I saw!\n";
}
* ''print'' takes a list of items and sends them all to STDOUT (unseparated). ''print @array;'' prints the elements run together, while ''print "@array";'' interpolates them separated by spaces.
print <>; # source code for 'cat'
print sort <>; # source code for 'sort'
* C-like ''printf'' function: %g for number auto-format, %10s, %-10d etc.
my @items = qw( wilma dino pebbles );
my $format = "The items are:\n" . ("%10s\n" x @items);
printf $format, @items;
printf "The items are:\n".("%10s\n" x @items), @items;
* Filehandles can be barewords (upper-cased) or variables. Special filehandles are: STDIN, STDOUT, STDERR, DATA, ARGV, and ARGVOUT.
open CONFIG, '<', 'fred' or die "Cannot open fred: $!"; # use "or": "||" would bind to 'fred' and never call die
open LOG, '>>:encoding(UTF-8)','logfile'; # for perl >= 5.6
open my $bedrock, '>:crlf', $file_name; # DOS-formatted output
binmode STDOUT, ':encoding(UTF-8)';
* ''select'' can be used to change default output filehandle. ''$| = 1;'' unbuffers currently selected output.
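The filehandle operations above can be tried without touching STDIN. This sketch is built on assumptions: it writes to a scratch file created with the core ''File::Temp'' module (not mentioned in these notes) and reads it back through a lexical filehandle, which behaves like ''<STDIN>'' for the line-input operator.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, removed automatically at exit.
my ($tmp_fh, $file_name) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write: one left-justified, 10-wide field per item.
open my $out, '>', $file_name or die "Cannot write $file_name: $!";
my @items  = qw( wilma dino pebbles );
my $format = "%-10s|\n" x @items;       # @items in scalar context is 3
printf {$out} $format, @items;
close $out;

# Read everything back in list context and strip the newlines.
open my $in, '<', $file_name or die "Cannot read $file_name: $!";
chomp(my @lines = <$in>);
close $in;

# select returns the previously selected handle, so it can be restored.
my $previous = select(STDERR);
$| = 1;                                 # unbuffer STDERR
select($previous);

print "$_\n" for @lines;
```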
===== User Subroutines =====
* Subroutines are in their own namespace and are typically called using ''&mysub'' (sometimes the ampersand can be omitted).
* Subroutines can be defined anywhere e.g.
sub mysub {
my($m, $n) = @_;
$m + $n;
}
* The value of the last expression evaluated is returned. But ''return $a;'' can be used to return immediately.
* A list can be returned if the subroutine is called in a list context (''wantarray'' can be used to detect list or scalar context).
* Arguments are in the list @_ . If ''&mysub;'' is used (ampersand, no parentheses), the caller's current @_ is passed along.
* Lexically scoped variables can be declared in any block by prefixing with ''my''.
* In Perl 5.10 and up ''state $x;'' can be used to declare a private variable that keeps its value between calls (enabled by ''use 5.010;'' or ''use feature 'state';'').
* Ampersand is optional if the subroutine has previously been declared or if parentheses are used. Ampersand is not optional if a built-in subroutine is being overridden.
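A sketch covering the subroutine features above. All subroutine names are hypothetical.

```perl
use 5.010;       # enables "state"
use strict;
use warnings;

# The last expression evaluated is the return value.
sub add {
    my ($m, $n) = @_;
    $m + $n;
}

# wantarray is true when the caller expects a list.
sub context {
    return wantarray ? 'list' : 'scalar';
}

# A state variable is initialised once and keeps its value between calls.
sub counter {
    state $calls = 0;
    return ++$calls;
}

# &name; with no parentheses passes the caller's @_ along.
sub count_args { return scalar @_ }
sub forwarder  { return &count_args; }

my $sum            = add(2, 3);            # 5
my @in_list        = context();            # ('list')
my $in_scalar      = context();            # 'scalar'
counter();
counter();
my $third_call     = counter();            # 3
my $forwarded_args = forwarder(1, 2, 3);   # 3

print "$sum $in_list[0] $in_scalar $third_call $forwarded_args\n";
```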