Pragmatic Functional Programming in Perl

G. Wade Johnson

Houston.pm

What is Functional Programming?

Functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data.

A programming paradigm is an approach to thinking about and organizing problems and solutions.

I promise there won't be too many other definitions in this talk.

Major Principles of Functional Programming

Immutable data is not really a requirement, but most functional languages do tend to use that approach.

In this case, pure means that the output of the function depends only on its inputs. People also tend to add the constraint that a pure function has no side effects.

By first-class objects, I don't mean that functions are objects in the OOP sense. I just mean that functions can be used in much the same way as integers, floats, and strings.

Higher order functions are functions that apply to functions. They can either take functions as parameters, return functions, or both.
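To make the idea of purity above a bit more concrete, here is a minimal sketch. The subs add and add_to_total are names I made up for illustration; they are not from any module.

```perl
use strict;
use warnings;

# Pure: the result depends only on the arguments, and nothing else changes.
sub add
{
    my ($x, $y) = @_;
    return $x + $y;
}

# Impure: the result depends on, and modifies, a variable outside the sub.
my $total = 0;
sub add_to_total
{
    my ($n) = @_;
    $total += $n;
    return $total;
}

my @pure   = ( add( 2, 3 ), add( 2, 3 ) );              # same answer both times
my @impure = ( add_to_total( 5 ), add_to_total( 5 ) );  # answers differ
```

Calling add with the same arguments always gives the same answer. Calling add_to_total with the same argument does not, because it carries hidden state between calls.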

Immutable Data

Perl doesn't directly support immutable data, or even constants really. There are a few modules that retrofit constant or read-only semantics onto Perl.
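As one hedged illustration, the core constant pragma gets you part of the way. The names PI, WEEKDAYS, and LIMITS are mine, chosen only for this sketch. CPAN modules such as Readonly and Const::Fast go further, but I'm sticking to core Perl here.

```perl
use strict;
use warnings;

use constant PI       => 3.14159;
use constant WEEKDAYS => qw(Mon Tue Wed Thu Fri);

my $area  = PI * 2 ** 2;      # constants are used like barewords
my @days  = (WEEKDAYS);       # a list constant expands to its values
my $first = (WEEKDAYS)[0];

# Caveat: a constant that holds a reference only protects the reference,
# not the data behind it.
use constant LIMITS => { max => 10 };
LIMITS->{max} = 20;           # allowed: the hash itself is still mutable
```

The last two lines show why I say Perl doesn't really have constants: the constant always returns the same hash reference, but nothing stops code from changing what is inside the hash.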

map/grep/sort

Although it is possible to modify the original array through map and grep (inside the block, $_ is an alias for the current element), we should not if we are following the principle of immutable data.

Code using map and grep is easier to reason about if you don't mutate the input.
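Here is a small sketch of the difference. The array names are made up for the example; @scratch is a copy so the demonstration doesn't destroy @times.

```perl
use strict;
use warnings;

my @times = ( 60, 120, 180 );

# Mutating style: $_ is an alias, so /= changes the source array in place.
my @scratch = @times;
my @mutated = map { $_ /= 60 } @scratch;

# Non-mutating style: compute a new value and leave the input alone.
my @minutes = map { $_ / 60 } @times;
```

After this runs, @scratch has been overwritten with the converted values, while @times still holds what we started with.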

Examples


  my @minutes = map { $_ / 60 } @times;

  my @odd = grep { $_ % 2 == 1 } @values;

  my @sorted = sort { $a->{name} cmp $b->{name} } @unsorted;

These are some really simple uses of these builtins. It's often handy to think of each one as transforming the whole list instead of thinking about the individual elements. This helps the idioms become more familiar. It is similar to thinking of 4 * 5 as four times five rather than four added to itself five times.

Data Pipelines

Since most of these operations take a list as an argument and return a list, it's easy to chain them to perform more complicated operations. The only downside is you have to read the chain from right to left.
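A minimal sketch of such a chain, using a made-up word list. Read it from the bottom up: filter, then transform, then sort.

```perl
use strict;
use warnings;

my @words = qw(pear Apple fig banana apple Cherry);

my @result = sort { $a cmp $b }          # 3. sort alphabetically
             map  { lc($_) }             # 2. lowercase each survivor
             grep { length($_) > 3 }     # 1. keep words longer than 3 letters
             @words;
```

Each stage takes a list and hands a new list to the stage above it, and @words itself is never modified.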

Schwartzian Transform

Take an expensive sort, transform into a list that's quicker to sort, then transform back into the sorted list.


  my @sorted = map  { $_->[1] }
               sort { $a->[0] <=> $b->[0] }
               map  { [ expensive_operation( $_ ), $_ ] }
               @unsorted;

If you've worked in Perl for long, you've probably run across the Schwartzian Transform. Its speed comes from reducing the number of calls to the expensive function: once per element instead of twice per comparison. You can find a sort_by function in List::MoreUtils that wraps up some of this.
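To see where the savings come from, here is a sketch that counts calls. The body of expensive_operation is a stand-in I invented (it just returns the string length and bumps a counter); the word list is made up too.

```perl
use strict;
use warnings;

my $calls = 0;

# Stand-in for the expensive key computation; it only counts its calls.
sub expensive_operation
{
    my ($item) = @_;
    ++$calls;
    return length $item;
}

my @unsorted = qw(banana fig pear apple kiwifruit at);

# Naive sort: the key is recomputed for both sides of every comparison.
my @naive = sort { expensive_operation($a) <=> expensive_operation($b) }
            @unsorted;
my $naive_calls = $calls;

# Schwartzian Transform: the key is computed exactly once per element.
$calls = 0;
my @sorted = map  { $_->[1] }
             sort { $a->[0] <=> $b->[0] }
             map  { [ expensive_operation($_), $_ ] }
             @unsorted;
my $st_calls = $calls;
```

Both approaches give the same order, but $st_calls equals the number of elements while $naive_calls grows with the number of comparisons the sort needs.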

Functions as first-class objects

In languages that support functional programming, functions can be used in similar ways to other forms of data like integers and strings.

Perl code object examples


  sub foo { return 'string'; }

  my $proc = \&foo;

  my $proc1 = sub { return 'string'; };

  my $bar = 0;
  my $counter = sub { return ++$bar; };
  $counter->();

Here are some examples of subs used in the same way as other data. It's important to note that $proc and $proc1 do not contain 'string'. Instead, they contain references to subs that return that string.
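A short sketch of what "used like other data" means in practice. The variable names here are mine, picked just for this example.

```perl
use strict;
use warnings;

my $proc = sub { return 'string'; };

my $kind  = ref $proc;     # the reference type of a sub is 'CODE'
my $value = $proc->();     # calling through the reference gives 'string'

# Code references can be stored in arrays (or hashes) like any other scalar.
my @procs = ( $proc, sub { return uc $_[0]; } );
my $shout = $procs[1]->('hi');
```

The reference and the result of calling it are different things: $proc is a CODE reference, and only $proc->() produces the string.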

Higher order functions

Obviously, if you can put a function in a variable, you should be able to pass it as a parameter to another subroutine, or return it.

Example: Functions in Variables


  sub hello
  {
    my ($target) = @_;
    print "Hello, $target\n";
  }

  my $proc = \&hello;

  hello('world');
  $proc->('world');

In this example, $proc contains a reference to the hello subroutine. In the last line, we execute that subroutine through the reference. This is exactly the same as executing it directly.

Example: Functions as Arguments


  sub second (&@)
  {
      my $pred = shift;
      my $matches = 0;
      foreach (@_)
      {
          if($pred->())
          {
              return $_ if $matches;
              ++$matches;
          }
      }
      return;
  }

The prototype (&@) allows us to leave off the sub keyword when passing an anonymous sub as the first argument. This sub expects a coderef that will be used as a predicate (it should return true or false), followed by a list of values. The predicate tests the current item through $_. The sub returns the second item from the list that causes the predicate to evaluate to true. If there is no such item, it returns undef in scalar context (an empty list in list context).
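Here is how calling second might look. The sub is repeated from the slide so the sketch runs on its own; the sample values are made up.

```perl
use strict;
use warnings;

sub second (&@)
{
    my $pred = shift;
    my $matches = 0;
    foreach (@_)
    {
        if ( $pred->() )
        {
            return $_ if $matches;
            ++$matches;
        }
    }
    return;
}

my @values = ( 3, 8, 5, 12, 7 );

# Block syntax, just like grep: 8 is the first match, 5 is the second.
my $hit = second { $_ > 4 } @values;

# Only 12 matches, so there is no second match.
my $miss = second { $_ > 10 } @values;

# The explicit form works as well.
my $same = second( sub { $_ > 4 }, @values );
```

Because of the prototype, second has to be declared before the point where it is called with the block syntax.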

Example: Returning Functions


  sub d6
  {
      return sub { return 1 + int( rand 6 ); };
  }

  sub counter_by
  {
     my ($inc) = @_;
     $inc ||= 1;
     my $count = 0;
     return sub { return $count += $inc; }
  }

These are examples of subroutines that return subroutines. In both cases, we create an anonymous sub that is returned. The first example returns an identical sub each time. The second has the special property that it remembers the increment that you supplied to it and a counter that it will increment each time it is called. This memory of variables defined outside the function is what makes the anonymous function a closure.
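A quick sketch of the closure behavior described above. counter_by is copied from the slide; the variable names $by_one and $by_five are mine.

```perl
use strict;
use warnings;

sub counter_by
{
    my ($inc) = @_;
    $inc ||= 1;
    my $count = 0;
    return sub { return $count += $inc; };
}

my $by_one  = counter_by();
my $by_five = counter_by(5);

# Each closure keeps its own private $count and $inc.
my @ones  = ( $by_one->(),  $by_one->(),  $by_one->() );
my @fives = ( $by_five->(), $by_five->() );
```

The two counters don't interfere with each other, because every call to counter_by creates fresh $count and $inc variables for the new sub to capture.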

Example: Currying


  sub bind_first
  {
      my ($proc, $arg) = @_;
      return sub { return $proc->( $arg, @_ ); }
  }

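The term currying is often used loosely for what bind_first does: we take a function with one or more parameters and generate a new function that is the original function with its first parameter bound to a particular value. Strictly speaking, this is partial application. Currying proper turns a function of several parameters into a chain of one-parameter functions.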
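A usage sketch for bind_first. The power sub is a made-up helper for the example.

```perl
use strict;
use warnings;

sub bind_first
{
    my ($proc, $arg) = @_;
    return sub { return $proc->( $arg, @_ ); };
}

# Hypothetical two-argument function to bind.
sub power
{
    my ($base, $exp) = @_;
    return $base ** $exp;
}

# Fix the first argument (the base) at 2.
my $power_of_two = bind_first( \&power, 2 );

my $kilo   = $power_of_two->(10);                   # same as power( 2, 10 )
my @powers = map { $power_of_two->($_) } 0 .. 4;
```

The new sub remembers both the original function and the bound value, so callers only supply the remaining arguments.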

Honestly, I don't use this technique very often. I did use it quite a bit in C++ when using the STL (when it was still called that).

Case Study: Process a Log File

Look at the code in the examples/log directory for some examples of code that does some log processing with different levels of functional design.

No need to go all the way

You can use functional techniques embedded in your standard Perl.


  foreach my $k (sort keys %hash)
  { ... }

  foreach my $el (sort { $hash{$b} <=> $hash{$a} } keys %hash)
  { ... }

  foreach my $el (grep { defined $_ } @input)
  { ... }

  my $num_capped = grep { /\A[A-Z]/ } @input;

Since Perl is not just a functional programming language, we are free to use as much or as little of the functional paradigm as needed to solve our problems. I often find it useful to do a small amount of list processing before doing the bulk of the work in a more procedural style.
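Here is one way the snippets above might fit into otherwise ordinary procedural code. The data in %hash and @input is invented for the sketch.

```perl
use strict;
use warnings;

my %hash = ( apple => 3, pear => 7, fig => 1 );

# A little list processing up front (sort keys by descending value),
# then a plain procedural loop body.
my @report;
foreach my $el ( sort { $hash{$b} <=> $hash{$a} } keys %hash )
{
    push @report, "$el: $hash{$el}";
}

my @input = ( 'Alpha', undef, 'beta', 'Gamma' );

# In scalar context grep returns the number of matches.
my $num_capped = grep { /\A[A-Z]/ } grep { defined $_ } @input;
```

The inner grep drops the undefined entry first, so the regex in the outer grep never sees undef.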

File::Find


   use File::Find;

   my $TEN_MEG = 10 * 1024 * 1024;
   my @large_files;

   # With no_chdir set, $_ holds the same path as $File::Find::name.
   sub save_large
   {
      push @large_files, $_ if -s $_ > $TEN_MEG;
   }
   File::Find::find({ wanted => \&save_large,
                      no_chdir => 1
                    },
                    '.');

The find function takes a function that it calls for every file and directory found while traversing a directory structure.
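Since find just wants a code reference, we can also hand it a closure instead of a named sub that writes to a global array. This is a sketch under my own assumptions: make_size_collector is a name I invented, and the temporary directory exists only to make the example self-contained.

```perl
use strict;
use warnings;

use File::Find ();
use File::Temp qw(tempdir);

# Returns a 'wanted' callback that pushes plain files larger than
# $min_size bytes onto the array referenced by $found.
sub make_size_collector
{
    my ($min_size, $found) = @_;
    return sub {
        push @{$found}, $_ if -f $_ && ( -s $_ ) > $min_size;
    };
}

# Build a scratch directory with one small and one larger file.
my $dir = tempdir( CLEANUP => 1 );
foreach my $spec ( [ 'small.txt', 10 ], [ 'big.txt', 2000 ] )
{
    my ($name, $size) = @{$spec};
    open my $fh, '>', "$dir/$name" or die "Cannot write $name: $!";
    print {$fh} 'x' x $size;
    close $fh or die "Cannot close $name: $!";
}

my @large;
File::Find::find( { wanted   => make_size_collector( 1000, \@large ),
                    no_chdir => 1,
                  },
                  $dir );
```

The threshold and the destination array are now parameters, so the same helper can serve several searches without sharing state between them.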

Dispatch Table


  my %commands = (
    '+'     => sub { $_[0] + $_[1] },
    '-'     => sub { $_[0] - $_[1] },
    '*'     => sub { $_[0] * $_[1] },
    '/'     => sub { $_[0] / $_[1] },
    'print' => sub { print $_[0] },
  );

This is one of my favorite techniques. It is very useful for converting a DSL or command line arguments into actual code.
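A sketch of how a table like this might be driven. The run_command sub is a helper I made up, and I left out the print entry so the example produces no output.

```perl
use strict;
use warnings;

my %commands = (
    '+' => sub { $_[0] + $_[1] },
    '-' => sub { $_[0] - $_[1] },
    '*' => sub { $_[0] * $_[1] },
    '/' => sub { $_[0] / $_[1] },
);

# Look up the handler by name and call it; unknown names are an error.
sub run_command
{
    my ($name, @args) = @_;
    my $handler = $commands{$name}
        or die "Unknown command: $name\n";
    return $handler->(@args);
}

my $sum     = run_command( '+', 2, 3 );
my $product = run_command( '*', 4, 5 );

# An unknown command dies, so $ok stays undefined and $@ has the message.
my $ok = eval { run_command( 'frobnicate', 1 ); 1 };
```

Adding a new command is just a matter of adding another entry to the hash; the lookup code never has to change.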