Perl List Operations

G. Wade Johnson

Lists and Arrays

This is much like the distinction between 5 and a scalar that holds the value. Although it seems like there is no real difference Perl does treat the two differently in a few cases.

Array Slices


    my @foo  = @array[ 1, 2, 3, 4 ];
    my @bar  = @array[ 1..4 ];
    my @baz  = @array[ 1, 0, 2, -1, -2 ];
    my @quux = @array[ 1, 2, 1, 1, 3, 0 ];

The array indices do not have to be in increasing order and can repeat. Negative indexes are allowed and count from the right just like normal indexes.

Array Operators

These operators add and remove items from an array. The splice operator is pretty general purpose and powerful enough that you will want to read the perldoc page and experiment with it to understand.

All of these are array operators because they modify an array that is passed as an argument. A list won't work because it does not have anything to modify. Calling push on a list is like trying to execute 5 += 3;.

List Operators

These operators apply to lists. We won't spend much time on join, split, and reverse. The operation of these operators is mostly self-explanatory.

Most people find that sort is necessary to learn, because they have no explicit way of doing it by hand. So, they accept the need and learn it. The map and grep operators are the ones that many people ignore because they can hand-roll loops to do mostly the same thing.

split and the m// operator can be used to generate a list out of a string. join combines a list with a separator to create a string.

List From Scalar


    my @fields = split /,/, $line;
    my @identifiers = ($line =~ m/(\w+)/g);

If you know the delimiters, choose split. If you know what you want to keep, use m//

map: Transform


    my @mins = map { int( ($_ / 60) + 0.5 ) } @seconds;
    my @dts  = map { DateTime->from_epoch( epoch => $_ ) }
              @times;

Convert a list of one kind of thing into a list of a different kind of thing. More useful if you think of the list being transformed rather than the elements. Make a DateTime array out of an epoch times array.

"map is Too Advanced"


    my $val = 0;
    foreach my $i ( 1 .. $y )
    {
        $val += $x;
    }

I have often heard people say that map is too advanced, so we shouldn't use it. This is sort of like saying multiply is too advanced. After all we can always do integer multiplication with repeated adds. That's clearer for someone who has never learned to multiply.

grep: Filter


    my @long = grep { length > 30 } @names;
    my @old  = grep { $_ < time } @times;
    die "No files found.\n"
        unless grep { -f $_ } @files;

sort


    my @foo = sort @names;
    my @bar = sort { length $a <=> length $b } @names;
    my @baz = sort numerically @counts;
notes

Schwartzian Transform


   my @sorted = map  { $_->[1] }
                sort { $a->[0] cmp $b->[0] }
                map  { [ expensive_key( $_ ), $_ ] }
                @unsorted;

If the key that we use for the comparison is expensive to compute, we want to avoid computing it more than necessary. The sort does O( n log n ) comparisons, touching two keys each time. For 1024 elements, the naive approach is 20480 (1024*10*2) calls to the key calculation function.

The Schwartzian Transform reduces this to 1024 calls. There's a little more cost for making the pairs and undoing at the end. But, the overall result is much faster for reasonably large data sets.

Module List::Util

All of the transform functions can be implemented with reduce. The reduce operation generates a scalar out of a list.

All of the logical operations and first can be implemented (sometimes inefficiently) with grep. The other search methods provide solutions that you could easily make with a foreach loop, but these are more efficient and clear.

None of these is particularly difficult to implement by hand, but using these functions results in clearer code that you can read instead of parsing.

List::Util Example 1

Let's look at an example of checking for required parameters.

Open examples/missing_benchmark.pl in editor.

Although you might get excited about the factor of two difference in speed. Look at the rate. Will it matter?

List::Util Example 2

Let's look at an example of finding the maximum string in a set.

Open examples/maxstr_benchmark.pl in editor.

If you didn't have the example of the short form, how much reading would you need to do to recognize what it does. More importantly, did you notice the inefficiency?

Module List::MoreUtils

Originally, List::MoreUtils covered utility functions that had been left out of List::Util. Most of the logical operators have now been added, as well as first, which is equivalent to firstval.

Module List::MoreUtils, continued

The sort_by and nsort_by operators remove much of the need for the Schwartzian transform, at least for obvious cases. Some benchmarking shows ST may still be faster in some circumstances.

List::MoreUtils Example 1

Let's look at an example of sorting by a key.

Open examples/sort_by_benchmark.pl in editor.

Summary

notes