Perl Unit Testing: Techniques
G. Wade Johnson
cPanel, Inc.
Unit Testing Techniques
- Testable Code
- Global State
- Libraries for Testing
- Making Code Testable
- Testing Patterns
- Testing Anti-Patterns
- Coverage
These are the topics I plan to cover today. This won't cover everything you
need to know about unit testing in Perl, but it will give you more than the
intro talk.
What is Testable Code?
Testable code vs. Code that can be tested.
What do we mean when we say code is testable?
Attributes of Testable Code
- Small, focused functions
- No (minimal) side effects
- No dependency on global state
- Well-defined inputs
- Well-defined outputs
All of these attributes make the code easier to test. It's possible to
perform multiple tests, changing a single input, and get a defined output.
You may be able to test other code, but it will not be easy.
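As a minimal sketch of these attributes, the hypothetical function below (the name `total_price` and its parameters are invented for illustration) takes all of its input as arguments and returns all of its output. Each test changes a single input and checks the result.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical example: every input is an argument, the only output is the
# return value, and nothing outside the function is read or changed.
sub total_price {
    my ($unit_price, $quantity, $tax_rate) = @_;
    return $unit_price * $quantity * ( 1 + $tax_rate );
}

# Change one input at a time and check the defined output.
is( total_price( 10, 1, 0 ),   10, 'single item, no tax' );
is( total_price( 10, 3, 0 ),   30, 'quantity changes the total' );
is( total_price( 10, 1, 0.5 ), 15, 'tax rate changes the total' );

done_testing();
```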
Reality Check
- Multi-thousand line functions
- ... that print to the screen and change disk
- ... and depend on a database and environment variables
- ... that go find their input directly, instead of using arguments
- ... and write their output to various locations without returning anything.
Global State
Possibly the biggest problem making code hard to test.
- for input
- for output
- for both
Global state may be affected by things outside the scope of your
test. This makes repeatable, consistent tests quite difficult.
Global Input
- Singleton objects
- Databases
- File system
- Environment variables
- Current directory
- Other processes
- Humans
If your code depends on any of these items, it will be harder to test.
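One way to keep an environment-variable dependency under control in a test is to localize `%ENV` entries, so the test decides the value and the original is restored afterwards. This is only a sketch: `log_level` and `APP_LOG_LEVEL` are hypothetical names, and `delete local` needs Perl 5.12 or later.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function that takes its input from the environment.
sub log_level {
    return $ENV{APP_LOG_LEVEL} // 'info';
}

{
    # local restores the previous value when the block exits.
    local $ENV{APP_LOG_LEVEL} = 'debug';
    is( log_level(), 'debug', 'uses the environment value when set' );
}

{
    # delete local removes the key for this block only (Perl 5.12+).
    delete local $ENV{APP_LOG_LEVEL};
    is( log_level(), 'info', 'falls back to the default when unset' );
}

done_testing();
```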
Global Output
- Singleton objects
- Databases
- File system
- Current directory
- Other processes
- Humans
- Printer
If your code changes any of these items, it will be harder to test.
Side Effects
Code that affects global state has side effects.
Libraries for Testing
- Test::NoWarnings
- Test::Output
- Company-specific libraries
Test::Output
Example
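use Test::More;
use Test::Output;
# Each check takes a code block, the expected output, and an optional test name.
stdout_is { print "Hello World"; } 'Hello World', 'exact match on STDOUT';
# stdout_like needs a compiled regex (qr//), not a word list (qw//).
stdout_like { print 'Hello Wade'; } qr/Wade/, 'pattern match on STDOUT';
stderr_is { print STDERR 'Hello'; } 'Hello', 'exact match on STDERR';
done_testing();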
Finding Libraries
use lib is your friend
use lib "t/lib";
use FindBin;
use lib "$FindBin::Bin/..";
use lib "$FindBin::Bin/lib";
Making Code Testable
- Characterize the code with tests
- Factor out coherent pieces of functionality
- Unit test the new functions
- Change global information into parameters where possible
- Extend tests
- Repeat until done
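The steps above might look like the following sketch. Everything here is hypothetical (the function names and the `DISK_LIMIT_MB` and `DISK_USED_MB` variables are invented): a legacy routine that finds its own input and prints its result has its decision logic factored out into a small function with parameters and a return value, which is then unit tested.

```perl
use strict;
use warnings;
use Test::More;

# Before (hypothetical legacy code): reads global input, prints the result.
sub report_disk_usage_legacy {
    my $limit = $ENV{DISK_LIMIT_MB} || 100;
    my $used  = $ENV{DISK_USED_MB}  || 0;
    print $used > $limit ? "OVER LIMIT\n" : "ok\n";
    return;
}

# After: the decision is a small function with arguments and a return value.
sub is_over_limit {
    my ($used_mb, $limit_mb) = @_;
    return $used_mb > $limit_mb ? 1 : 0;
}

# The wrapper still gathers the global input, but holds no logic of its own.
sub report_disk_usage {
    my $limit = $ENV{DISK_LIMIT_MB} || 100;
    my $used  = $ENV{DISK_USED_MB}  || 0;
    print is_over_limit( $used, $limit ) ? "OVER LIMIT\n" : "ok\n";
    return;
}

# Unit test the new function without touching the environment or STDOUT.
is( is_over_limit( 50,  100 ), 0, 'under the limit' );
is( is_over_limit( 100, 100 ), 0, 'exactly at the limit' );
is( is_over_limit( 101, 100 ), 1, 'over the limit' );

done_testing();
```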
Testing Patterns
- Dependency Injection
- Mocking
- Edge Cases/Boundary Conditions
- Data-Driven Tests
- Error-Handling Testing
- Fuzz Testing
The first two are mostly useful for reducing global dependencies and
removing coupling between independent systems. The third through fifth are
ways to think about generating new tests. The final is magic.
Dependency Injection
Provide global dependency as a parameter.
... Use defaulting for the common case
sub foo {
my ($parm, $gobj) = (@_);
$gobj ||= $global_object;
...
}
Dependency Injection, Usage
Live code
foo( $parm );
Test code
foo( $parm, $fake_global_object );
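A fuller, runnable version of the same idea, as a sketch only: the `SystemClock` and `FakeClock` classes and the `greeting` function are hypothetical stand-ins for a real global object and the code that uses it. The `package NAME { ... }` syntax assumes Perl 5.14 or later.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical global dependency: reports the current hour.
package SystemClock {
    sub new  { return bless {}, shift }
    sub hour { return (localtime)[2] }
}

# Hypothetical fake with the same interface and a fixed answer.
package FakeClock {
    sub new  { my ($class, $hour) = @_; return bless { hour => $hour }, $class }
    sub hour { return $_[0]{hour} }
}

package main;

my $global_clock = SystemClock->new;

sub greeting {
    my ($name, $clock) = @_;
    $clock ||= $global_clock;    # default to the global for live code
    my $salutation = $clock->hour < 12 ? 'Good morning' : 'Good afternoon';
    return "$salutation, $name";
}

# Test code injects a fake, so the result does not depend on the time of day.
is( greeting( 'Wade', FakeClock->new(9) ),  'Good morning, Wade',   'morning' );
is( greeting( 'Wade', FakeClock->new(15) ), 'Good afternoon, Wade', 'afternoon' );

# Live code omits the second argument and gets the real clock.
like( greeting('Wade'), qr/\AGood (?:morning|afternoon), Wade\z/, 'default clock' );

done_testing();
```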
Dependency Injection, Part Two
Wrap retrieving the global data in a subroutine.
... In the test code, override that subroutine.
package Foo;
sub _get_global_object {
return $global_object;
}
sub bar {
my ($parm) = (@_);
my $gobj = _get_global_object();
...
}
Dependency Injection 2, Usage
Live code
Foo::bar( $parm );
Test code
{
package Foo;
no warnings 'redefine';
sub _get_global_object { return $fake_global_object; }
}
Foo::bar( $parm );
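A variation worth considering is to override the accessor with `local` on the glob, so the replacement only lasts for one block and the original comes back automatically. This is a sketch, not the only way to do it; the body of `Foo::bar` and the contents of the global object are invented so that the example runs.

```perl
use strict;
use warnings;
use Test::More;

package Foo;

# Hypothetical global object and the code that depends on it.
my $global_object = { name => 'production' };

sub _get_global_object {
    return $global_object;
}

sub bar {
    my ($parm) = @_;
    my $gobj = _get_global_object();
    return "$parm:$gobj->{name}";
}

package main;

{
    no warnings 'redefine';
    # Replace the accessor for the duration of this block only.
    local *Foo::_get_global_object = sub { return { name => 'fake' } };
    is( Foo::bar('x'), 'x:fake', 'bar sees the fake object' );
}

# Outside the block the original accessor is back in place.
is( Foo::bar('x'), 'x:production', 'original accessor restored' );

done_testing();
```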
What is Mocking?
- Not the same as heckling
- Replace implementation of a module needed by the code
- Provides the same interface
- Minimal functionality
- No damaging side effects
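One hand-rolled way to mock a module, sketched under assumptions: `My::Mailer` and `My::Signup` are hypothetical names. The mock defines the package itself and marks it as loaded in `%INC`, so a later `require` of the real module becomes a no-op. It offers the same interface, records calls, and sends nothing.

```perl
use strict;
use warnings;
use Test::More;

# Mock of a hypothetical My::Mailer, installed before the code under test loads.
BEGIN {
    package My::Mailer;
    our @sent;

    # Same interface as the (assumed) real module, minimal behaviour.
    sub send_mail {
        my (%args) = @_;
        push @sent, {%args};
        return 1;
    }

    # Mark the module as already loaded so require will not look on disk.
    $INC{'My/Mailer.pm'} = __FILE__;
}

# Hypothetical code under test; normally this lives in its own file.
package My::Signup;
require My::Mailer;

sub register {
    my ($email) = @_;
    return My::Mailer::send_mail( to => $email, subject => 'Welcome' );
}

package main;

ok( My::Signup::register('user@example.com'), 'register reports success' );
is( scalar @My::Mailer::sent, 1, 'exactly one message was sent' );
is( $My::Mailer::sent[0]{to}, 'user@example.com', 'sent to the new user' );

done_testing();
```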
Why Mock?
- Separation of code under test from dependencies
- Module has effects that are difficult to reverse
- Difficult to generate some responses from the module
- Usually a sign that the code is not well-factored
Edge Cases/Boundary Conditions
"Bugs lurk in corners and congregate at boundaries." (Boris Beizer)
Most inputs to a function behave in much the same way. Boundaries are where
the behaviour of the function changes, so concentrate your tests there.
Potential Boundaries
- For scalars: undef
- For numbers: 0, -1, 1, max int, min int
- For ranged numbers: largest num, smallest num
- For strings: empty, single character, "\0"
- For limited strings: longest string, larger than longest string
- For hashes: empty hash, missing keys, extra keys
- For lists: empty list, longer than expected, undef elements
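To illustrate a few of these boundaries for a ranged number, here is a sketch using a hypothetical `clamp` function (name and behaviour invented for illustration). The tests sit on, just inside, and just outside each end of the range, plus `undef`.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function: force a value into the range [$min, $max].
# An undefined value is treated as below the range.
sub clamp {
    my ($value, $min, $max) = @_;
    return $min if !defined $value || $value < $min;
    return $max if $value > $max;
    return $value;
}

# Lower boundary: just outside, on, and just inside.
is( clamp( 0, 1, 10 ), 1, 'one below the minimum' );
is( clamp( 1, 1, 10 ), 1, 'exactly the minimum' );
is( clamp( 2, 1, 10 ), 2, 'one above the minimum' );

# Upper boundary: just inside, on, and just outside.
is( clamp( 9,  1, 10 ), 9,  'one below the maximum' );
is( clamp( 10, 1, 10 ), 10, 'exactly the maximum' );
is( clamp( 11, 1, 10 ), 10, 'one above the maximum' );

# Scalar boundaries: undef and a negative number.
is( clamp( undef, 1, 10 ), 1, 'undef is treated as below the range' );
is( clamp( -1,    1, 10 ), 1, 'negative value' );

done_testing();
```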
Data-Driven Tests
See example
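A data-driven test keeps the cases in a table and runs them all through one loop, so adding a case means adding a row. The sketch below is not the example from the talk; `is_leap_year` is a hypothetical function chosen because its rules are well known.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: Gregorian leap-year rule.
sub is_leap_year {
    my ($year) = @_;
    return ( $year % 4 == 0 && ( $year % 100 != 0 || $year % 400 == 0 ) ) ? 1 : 0;
}

# Each row: input, expected result, description.
my @cases = (
    [ 2024, 1, 'divisible by 4' ],
    [ 2023, 0, 'not divisible by 4' ],
    [ 1900, 0, 'divisible by 100 but not by 400' ],
    [ 2000, 1, 'divisible by 400' ],
);

# One loop runs every row through the same assertion.
for my $case (@cases) {
    my ($year, $expected, $name) = @{$case};
    is( is_leap_year($year), $expected, "$year: $name" );
}

done_testing();
```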
Error-Handling Testing
- It is important to test error handling
- It is important to test validation
- Error checking code tends to be the least tested
- Latent bugs in error-checking code can be insidious, because that code rarely runs in normal operation
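As a sketch of testing validation and error handling with only core modules, the hypothetical `parse_port` function below dies with a message on bad input. Each test traps the exception with `eval` and checks both that no value came back and that the message is the expected one.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: validates a TCP port number.
sub parse_port {
    my ($text) = @_;
    die "port is required\n" unless defined $text && length $text;
    die "port must be numeric: '$text'\n" unless $text =~ /\A[0-9]+\z/;
    die "port out of range: $text\n" if $text < 1 || $text > 65535;
    return $text + 0;
}

is( parse_port('8080'), 8080, 'valid port is returned as a number' );

# Each row: input, pattern the error must match, description.
my @bad_input = (
    [ undef,   qr/required/,     'undef' ],
    [ '',      qr/required/,     'empty string' ],
    [ 'http',  qr/numeric/,      'non-numeric text' ],
    [ '0',     qr/out of range/, 'below the range' ],
    [ '65536', qr/out of range/, 'above the range' ],
);

for my $case (@bad_input) {
    my ($input, $pattern, $name) = @{$case};
    my $result = eval { parse_port($input) };
    my $error  = $@;    # copy right away, before anything can reset $@
    is( $result, undef, "$name: no value returned" );
    like( $error, $pattern, "$name: expected error message" );
}

done_testing();
```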
Fuzz Testing
Random inputs in the hopes of triggering unusual error conditions.
Often used to attack code.
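A very small fuzzing loop might look like the sketch below. The `trim` function, the character set, and the iteration count are all assumptions chosen for illustration. The seed is fixed so that any failure can be reproduced, and the test checks properties that must hold for every input rather than specific expected values.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: strip leading and trailing whitespace.
sub trim {
    my ($text) = @_;
    $text =~ s/\A\s+//;
    $text =~ s/\s+\z//;
    return $text;
}

srand(42);    # fixed seed so a failing input can be reproduced

# Characters chosen to include whitespace, a NUL byte, and ordinary text.
my @alphabet = ( ' ', "\t", "\n", 'a', 'Z', '0', "\0", '-' );
my $failures = 0;

for my $run ( 1 .. 500 ) {
    my $length = int rand 20;
    my $input  = join '', map { $alphabet[ int rand @alphabet ] } 1 .. $length;

    my $output = eval { trim($input) };

    # Properties that must hold for every input, whatever it is.
    my $ok = defined $output
        && $output !~ /\A\s/
        && $output !~ /\s\z/
        && trim($output) eq $output;

    next if $ok;
    $failures++;
    diag( "run $run failed for input of length $length" );
}

is( $failures, 0, 'trim properties hold for 500 random inputs' );

done_testing();
```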
Testing Anti-Patterns
- Testing mode (flag)
- Saving data during run for test code
- Don't test the hard stuff
Test Mode
- Unit tests don't actually test production code
- Flags are almost never a good idea
- Added conditions slow production code
Saving Data for Test Code
- Saved data may not match actual functionality
- Added code slows production code
- Adds a global side effect
Conclusion
There's a lot more to good testing than just the assertions.
Coverage
How do you know how well you have tested your code?
Levels of Coverage
- Subroutine coverage
- Statement coverage
- Branch coverage
- Condition coverage
- Path coverage
Statement vs. Branch Coverage
if( get_value() > 0 ) {
do_it();
}
do_other();
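To make the difference concrete, here is a runnable sketch of the snippet above. It assumes the value is passed in as an argument instead of coming from `get_value()`, and `process` and the `@calls` log are invented for illustration. A single test with a positive value executes every statement, yet the path where the `if` is false is never taken until the second test runs.

```perl
use strict;
use warnings;
use Test::More;

# Record which routines ran, so the tests can observe the path taken.
my @calls;
sub do_it    { push @calls, 'do_it';    return }
sub do_other { push @calls, 'do_other'; return }

# Hypothetical wrapper around the snippet; takes the value as an argument.
sub process {
    my ($value) = @_;
    if ( $value > 0 ) {
        do_it();
    }
    do_other();
    return;
}

# This one call executes every statement: full statement coverage.
@calls = ();
process(1);
is_deeply( \@calls, [ 'do_it', 'do_other' ], 'condition true' );

# Branch coverage also needs the case where the condition is false.
@calls = ();
process(0);
is_deeply( \@calls, ['do_other'], 'condition false' );

done_testing();
```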
Branch vs. Condition Coverage
if( defined $var && $var > 0 ) {
do_it();
}
else {
do_other();
}
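A sketch of what this means in practice, with a hypothetical `classify` function standing in for the snippet and returning the name of the branch taken. The first two tests take both branches, but only ever make the compound condition false through its first half. The last two tests make `defined $var` true and `$var > 0` false.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the snippet: reports which branch was taken.
sub classify {
    my ($var) = @_;
    if ( defined $var && $var > 0 ) {
        return 'do_it';
    }
    else {
        return 'do_other';
    }
}

# These two tests take both branches: full branch coverage.
is( classify(5),     'do_it',    'defined and positive' );
is( classify(undef), 'do_other', 'undefined' );

# Condition coverage also needs the second half of the && to be false.
is( classify(0),  'do_other', 'defined but zero' );
is( classify(-1), 'do_other', 'defined but negative' );

done_testing();
```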
Devel::Cover
CPAN module that instruments code to determine what parts of it have been exercised.
Devel::Cover Output
See example
100% Coverage
- Any code not covered has not been executed by the tests.
- Some conditions may be difficult to duplicate.
- Law of Diminishing Returns
- 100% coverage is not the same as exhaustively tested.
Don't Trust 100% Coverage
100% coverage is necessary for complete testing, but it may not be sufficient.
Intelligent Testing vs. Code Coverage
Think about what needs to be tested rather than try to hit every line/branch.
Conclusion
- Full statement coverage is good, but full branch coverage is better.
- Full path coverage is probably not possible.
- While 80% coverage is better than 20% coverage, 100% may not be better than 80%.
- Coverage is an indicator, not a goal.