====== Perl Cheat Sheet ======

[[http://www.perl.org/ | Perl]] is a highly capable, feature-rich programming language with over 25 years of development. These notes follow the path of the [[http://www.amazon.com/Learning-Perl-Randal-L-Schwartz/dp/1449303587 | Learning Perl]] book.

  * The first line should be ''#!/usr/bin/perl'' or ''#!/usr/bin/env perl''.
  * Perl compiles the whole program to an internal byte-code form before executing it.
  * Scripts do not need a file extension; the book recommends leaving it off.
  * To require a minimum version, use ''use 5.010;''.
  * Semicolons (;) are required to separate statements.
  * ''perl -w'' turns on warnings for the entire program. To enable them for just the current file, use ''use warnings;''.
  * ''use diagnostics;'' adds longer explanations to warnings.
  * ''use strict;'' requires variables to be declared.
  * Comments start with # (there are no block comments).
  * Parentheses are optional unless they are part of the syntax.
  * ''@ARGV'' contains the command-line arguments, ''$0'' contains the program name.
  * ''die''/''warn'' can be used to exit/warn; ''$!'' will contain any system error message. If the message does not end in \n, perl appends the program name and line number to it.

===== Numbers =====

  * Perl has a single numeric type, so integer and floating-point literals are interchangeable, e.g. -6.5e24, 037 (octal), 0x2a13b (hex), 0b11010 (binary).
  * Underscores inside a literal are ignored and can be used for readability, e.g. 0x1377_0B77.
  * Arithmetic operators are * / + - % (modulus) %%**%% (exponentiation).
  * Numeric comparison operators are < <= == >= > !=

===== Strings =====

  * To use Unicode in source code, add ''use utf8;''.
  * Single-quoted literals such as '⅚∞☃☠' can contain any character except single-quote (use \') and backslash (use \\).
  * Double-quoted strings support backslash escapes for control characters (e.g. \n) and interpolate variables, e.g. "${var}s $var\n".
  * . is for concatenation, x for string repetition.
  * To insert a character by code, use ''chr(0x05d0)''. To convert a character to its code, use ''ord('א')''.
  * String comparison operators are lt, le, eq, ge, gt, and ne.

===== Scalar Variables =====

  * Scalars are numbers or strings (or undefined or references), used interchangeably.
  * $calar variable names start with $ followed by a letter or underscore, and can then contain letters (including Unicode), digits or underscores.
  * Scalar assignment looks like ''$a = 3;'' or ''$a .= "suffix";'' or ''$a %%**%%= 3;''
  * All-lowercase variable names with underscores for word separation are recommended, e.g. ''$var_name''.
  * ''print'' takes a single scalar or a list of scalars (separated by commas).
  * When a scalar is interpreted as a boolean, the following are //false//: 0, '0', %%''%% (the empty string) and ''undef''. Everything else is true.
  * !! is a handy shortcut to convert any scalar to a plain boolean: 1 for true, and for false the empty string (which counts as 0 in numeric context).
  * Variables have the value ''undef'' before being assigned. ''undef'' acts as 0 or "" as needed, but triggers a warning when used that way (e.g. printed) with warnings enabled.
  * ''defined()'' checks whether a value is defined. Assigning ''undef'' to a variable makes it undefined again.

===== Lists and Arrays =====

  * A list is an ordered collection of elements (each of which is a scalar value), with positions starting at 0.
  * @rrays are variables that store a list (separate namespace from scalar variables). ''@x'' refers to the entire array x.
  * Unassigned arrays start out as ''()'', the empty list, and not ''undef''.
  * To access an element of an array use ''$myarr[0]''; the result is ''undef'' if the element was never set.
  * The index of the last element is in ''$#arr''.
  * Negative array indices count from the end, so -1 refers to the last element. They do not wrap around more than once: reading beyond the start of the array gives ''undef''.
  * Literal lists: ''('abc','def')'' or ''(1..6)'' (integers 1 to 6) or ''qw(abc def)'' or ''qw#abc def#'' (quoted words, separated by whitespace).
  * List values can be assigned to variables, e.g.
<code perl>
($fred, $barney, $dino) = ("flintstone", "rubble", undef);
($fred, $barney) = ($barney, $fred);   # swap those values
</code>
  * ''push'' adds element(s) to the **end** of an array. ''pop'' removes the last element and returns it.
  * ''unshift'' adds element(s) to the **start** of an array. ''shift'' removes the first element and returns it.
  * ''splice'' removes elements from the middle of an array, returns them and optionally replaces them: ''splice @arr, $start, $len, @new_elems;''
  * ''sort'' and ''reverse'' return a new list and leave their argument unchanged (assign the result back to the original array variable to modify it).
  * Expressions are evaluated in either a scalar context or a list context. Scalars are promoted to single-element lists in list context.
  * List-producing expressions give different results in scalar context; an array variable returns its number of elements. The ''scalar'' function forces a scalar context, e.g. inside the argument list of ''print''.

===== Hashes =====

  * A hash is a collection of key-value pairs, indexed by a string (the key). ''%hash'' has its own namespace.
  * Hashes use scalable, efficient algorithms. They used to be called associative arrays.
  * Use ''$hash{$some_key}'' to access a value (curly braces instead of square brackets).
  * Assigning a hash to an array unwinds (flattens) it into a list of key-value pairs.
  * To initialize a hash:
<code perl>
%some_hash = ('a', 1, 'b', 2, 'c', 3);
%some_hash = ( 'a' => 1, 'b' => 2, 'c' => 3, );
</code>
  * When using a big arrow (a fat comma) or when accessing a value, simple keys don't have to be quoted (barewords), e.g.
<code perl>
%some_hash = ( a => 1 );
$some_hash{a};
</code>
  * ''%revhash = reverse %hash'' swaps keys and values (for non-unique values, the last one wins).
  * ''keys %hash;'' and ''values %hash;'' return a list of the keys or values in matching order (or the number of keys/values in scalar context).
  * ''%hash'' is true in a boolean context only if the hash has at least one key-value pair.
  * To iterate over a hash:
<code perl>
while ( my ($key, $value) = each %hash ) {
    print "$key => $value\n";
}
</code>
  or in order of keys:
<code perl>
foreach my $key (sort keys %hash) {
    print "$key => $hash{$key}\n";
}
</code>
  * The ''%ENV'' hash holds environment variables.
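The hash notes above can be tried out together in one small script. This is only a sketch: the ''%age'' data and the variable names are made up for illustration and do not come from the book.

```perl
use strict;
use warnings;

# Hypothetical data, chosen only for illustration.
my %age = (fred => 30, barney => 28, dino => 5);

# Simple keys can be written as barewords inside the braces too.
$age{wilma} = 29;

# keys in scalar context gives the number of key-value pairs.
my $count = keys %age;

# Swap keys and values; safe here because all the values are unique.
my %name_for = reverse %age;

# Hash order is not predictable, so sort the keys for stable output.
my @report;
foreach my $name (sort keys %age) {
    push @report, "$name => $age{$name}";
}

print "$_\n" for @report;
print "$count entries; the 5-year-old is $name_for{5}\n";
```

Sorting the keys costs a little extra work, but it makes the output repeatable between runs, which ''each'' on its own does not promise.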
===== Control Structures =====

  * if-elsif-else
<code perl>
if ($a == $b) {
    $a = 0;
} elsif ($a > $b) {
    $a = 1;
} else {
    $a = -1;
}
</code>
  * while loop
<code perl>
$count = 0;
while ($count < 10) {
    $count += 2;
}
</code>
  * foreach loop
<code perl>
foreach my $rock (@rocks) {
    # modifications to $rock modify the list element
    # $_ is used if loop variable is omitted
}
</code>

===== Input/Output =====

  * Read a line of input: ''$line = <STDIN>;''
  * ''chomp'' removes a trailing newline, e.g. ''chomp($line = <STDIN>);''
  * Assigning to a list reads all input up to EOF, e.g. ''chomp(@lines = <STDIN>);''
  * Loop over input (the line is stored in ''$_'' automatically only in this specific case, where the read is the whole ''while'' condition):
<code perl>
while (<STDIN>) {
    print "I saw $_";
}
</code>
  * ''foreach'' can also be used but uses more memory, since it reads all input into memory first.
  * ''<>'' iterates over the lines of all files named in ''@ARGV'' (or STDIN if there are no arguments), like cat/sed/awk do.
<code perl>
while (<>) {
    chomp;
    print LOGFILE "It was $_ that I saw!\n";
}
</code>
  * ''print'' takes a list of items and sends them all to STDOUT (unseparated). Compare ''print @array;'' (elements run together) with ''print "@array";'' (elements separated by spaces).
<code perl>
print <>;        # source code for 'cat'
print sort <>;   # source code for 'sort'
</code>
  * There is a C-like ''printf'' function: %g for number auto-format, %10s, %-10d etc.
<code perl>
my @items  = qw( wilma dino pebbles );
my $format = "The items are:\n" . ("%10s\n" x @items);
printf $format, @items;
# or as a single statement:
printf "The items are:\n" . ("%10s\n" x @items), @items;
</code>
  * Filehandles can be barewords (upper-cased) or variables. Special filehandles are: STDIN, STDOUT, STDERR, DATA, ARGV, and ARGVOUT.
<code perl>
open CONFIG, 'fred' or die "Cannot open fred: $!";   # 'or', not '||', so that die sees the result of open
open LOG, '>>:encoding(UTF-8)', 'logfile';           # three-argument open, for perl >= 5.6
open my $bedrock, '>:crlf', $file_name;              # DOS-formatted output
binmode STDOUT, ':encoding(UTF-8)';
</code>
  * ''select'' can be used to change the default output filehandle. ''$| = 1;'' unbuffers the currently selected output.

===== User Subroutines =====

  * Subroutines are in their own namespace and are typically called using ''&mysub'' (sometimes the ampersand can be omitted).
  * Subroutines can be defined anywhere in the file, e.g.
<code perl>
sub mysub {
    my ($m, $n) = @_;
    $m + $n;
}
</code>
  * The value of the last expression evaluated is returned, but ''return $a;'' can be used to return immediately.
  * A list can be returned if the subroutine is called in a list context (''wantarray'' can be used to detect list or scalar context).
  * Arguments are in the list ''@_''. If the subroutine is called as ''&mysub;'' (ampersand, no parentheses), it inherits the caller's argument list.
  * Lexically scoped variables can be used in any block by declaring them with ''my''.
  * In Perl 5.10 and up, ''state $x;'' can be used to declare a private variable that keeps its value between calls.
  * The ampersand is optional if the subroutine has previously been declared or if parentheses are used. The ampersand is not optional if the subroutine has the same name as a built-in.
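A short sketch pulling the subroutine notes together: implicit return value, ''wantarray'' and ''state''. The subroutine names (''total'', ''evens'', ''next_id'') are hypothetical and chosen only for this example; ''grep'' is a standard built-in that is not otherwise covered in these notes.

```perl
use strict;
use warnings;
use feature 'state';   # enables state variables (Perl 5.10+)

# Sums its arguments; the last expression evaluated is the return value.
sub total {
    my $sum = 0;
    $sum += $_ for @_;
    $sum;
}

# Context-sensitive return: the matching elements in list context,
# their count in scalar context.
sub evens {
    my @found = grep { $_ % 2 == 0 } @_;
    return wantarray ? @found : scalar @found;
}

# A state variable is initialized once and keeps its value between calls.
sub next_id {
    state $id = 0;
    return ++$id;
}

my $sum        = total(1, 2, 3);
my @even_list  = evens(1 .. 6);   # list context
my $even_count = evens(1 .. 6);   # scalar context
my @ids        = (next_id(), next_id(), next_id());

print "sum=$sum evens=@even_list count=$even_count ids=@ids\n";
```

The calls use parentheses and come after the definitions, so no ampersand is needed.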