Tormenting Your Tests with Heckle

Posted by kev Tue, 19 Dec 2006 09:24:00 GMT

Update: Ruby2Ruby is having gem propagation issues. Feel free to download the gem here directly and install via gem install ruby2ruby-1.1.2.gem.

Update 2: We’ve found a bug in the loading that causes problems when you supply a method to Heckle. A bug fix has been checked into the repo and we’re preparing a release. Look for 1.1.1 soonish.

Update 3: Ok, 1.1.1 is out the door. The gem server is syncing, so look for a new version this afternoon (12/20) with several bugs including the loading error fixed.

Yes, I know what you’re thinking. “Holy crap, Kevin posted for the first time in months! I thought he died, or got eaten by a corporate zombie, or set out on an epic adventure to find himself.” But hey, good things come to those who wait, right?

So, you’ve been waiting, and I’ve been writing Heckle. It’s a good thing.

Heckle is a mutation tester. It modifies your code and runs your tests to make sure they fail. The idea is that if code can be changed and your tests don’t notice, either that code isn’t being covered or it doesn’t do anything.

It’s a little weird, I know, but I like to think about it as pen-testing. It’s like hiring a white-hat hacker to try to break into your server and making sure you detect it. You learn the most by trying to break things and watching the outcome.

Anyway, Heckle was inspired by Jester, and Ryan Davis wrote a proof of concept at RubyConf. As he notes, I went a little nuts and much of the current implementation I rewrote that night or on the plane home.

You can install Heckle from Ruby Gems:

  gem install heckle --include-dependencies 

Let’s take the new toy out for a test drive.

Saying Hello to Branch Coverage

Sometimes line based code coverage tools can’t catch gaps. For example, let’s say we’re working on some simple greeter system. Our initial code and tests look like this:

  class Greeter
    def initialize(person)
      @person = person
    end

    def greet
      "Hi #{@person}!"
    end
  end

  require "test/unit"

  class TestGreeter < Test::Unit::TestCase
    def test_greet
      @greeter = Greeter.new('Kevin')
      assert_equal 'Hi Kevin!', @greeter.greet
    end
  end

Tests pass, and for this trivial example, coverage seems to be there. Running rcov confirms that every line in the Greeter class is being executed. But what happens when we decide to make the person attribute optional?

  class Greeter
    def initialize(person = nil)
      @person = person
    end

    def greet
      @person.nil? ? "Hi there!" : "Hi #{@person}!"
    end
  end

With this implementation, tests still pass and rcov still reports 100% coverage. Still, we know that one branch of that conditional, the one where @person is nil, isn’t being tested. Enter Heckle.

First let’s take a look at what Heckle tells us about these tests, and then we can go over how it does it. Usage information for Heckle is rather simple:

  odysseus:~/code/heckle_demo kev$ heckle
  Usage: heckle class_name [method_name]
      -v, --verbose                    Loudly explain heckle run
      -t, --tests TEST_PATTERN         Location of tests (glob)
      -h, --help                       Show this message

A simple run looks like this:

  odysseus:~/code/heckle_demo kev$ heckle Greeter
  Initial tests pass. Let's rumble.

  **********************************************************************
  ***  Greeter#greet loaded with 3 possible mutations
  **********************************************************************

  3 mutations remaining...
  2 mutations remaining...
  1 mutations remaining...

  The following mutations didn't cause test failures:

  def greet
    if @person.nil? then
      "z#\010]\021\r\e3&TX\001z+\021fOy\016N6\t%F\acu\027\023w\024;}3Vcs>\035\017<Nc]ra\023V0\005 3UB\031]97rN1L\017\020TVJ\t\003k!l;\fA\036?[{lj;}ir2fPNaI\020\020w6$\eR*"
    else
      "Hi #{@person}!"
    end
  end

Heckle replaced the string, “Hi there!” with a bunch of random characters but the tests still passed. The situation where @person is nil was never tested. If we add a new test then Heckle should quiet down:

  def test_greet_nobody
    @greeter = Greeter.new
    assert_equal 'Hi there!', @greeter.greet
  end

  odysseus:~/code/heckle_demo kev$ heckle Greeter
  Initial tests pass. Let's rumble.

  **********************************************************************
  ***  Greeter#greet loaded with 3 possible mutations
  **********************************************************************

  3 mutations remaining...
  2 mutations remaining...
  1 mutations remaining...
  No mutants survived. Cool!
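
If you want to convince yourself of what those two runs are saying, here is a rough by-hand version of the same experiment. It doesn't use Heckle at all: MutantGreeter, the two lambdas, and suite_passes? are names I made up for this sketch, and plain boolean checks stand in for the Test::Unit assertions.

```ruby
# The Greeter class from the post, with the optional person.
class Greeter
  def initialize(person = nil)
    @person = person
  end

  def greet
    @person.nil? ? "Hi there!" : "Hi #{@person}!"
  end
end

# Hypothetical hand-written mutant: the "Hi there!" literal is replaced
# with junk, much like Heckle did in the first run.
class MutantGreeter < Greeter
  def greet
    @person.nil? ? "zzz" : "Hi #{@person}!"
  end
end

# Stand-ins for the two tests; each returns true when its check passes.
TEST_GREET        = ->(klass) { klass.new('Kevin').greet == 'Hi Kevin!' }
TEST_GREET_NOBODY = ->(klass) { klass.new.greet == 'Hi there!' }

WEAK_SUITE   = [TEST_GREET].freeze
STRONG_SUITE = [TEST_GREET, TEST_GREET_NOBODY].freeze

# Hypothetical helper: does every check in the suite pass for this class?
def suite_passes?(klass, suite)
  suite.all? { |check| check.call(klass) }
end

# A mutant "survives" when the suite still passes against the mutated code.
{ "one test" => WEAK_SUITE, "two tests" => STRONG_SUITE }.each do |label, suite|
  verdict = suite_passes?(MutantGreeter, suite) ? "survived" : "was killed"
  puts "#{label}: mutant #{verdict}"
end
```

With only test_greet the mutant slips through, and adding test_greet_nobody is what kills it.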

Wait.. What? How’d it do that?

Heckle works by using the ParseTree and RubyToRuby libraries to grab the abstract syntax tree of methods, modify them, and evaluate the redefined method before running your tests. It can do all of this atomically, so each change can be seen individually. If you’d like to watch the action take place, you can supply the -v option. That last test run looks like this in verbose mode:

  odysseus:~/code/heckle_demo kev$ heckle -v Greeter
  Loaded suite /usr/local/bin/heckle
  Started
  ..
  Finished in 0.000447 seconds.

  2 tests, 2 assertions, 0 failures, 0 errors
  Initial tests pass. Let's rumble.

  **********************************************************************
  ***  Greeter#greet loaded with 3 possible mutations
  **********************************************************************

  3 mutations remaining...
  Replacing Greeter#greet with:

  def greet
    if @person.nil? then
      "uO i\032X#mcV"
    else
      "Hi #{@person}!"
    end
  end
  Loaded suite /usr/local/bin/heckle
  Started
  .F
  Finished in 0.00812000000000002 seconds.

    1) Failure:
  test_greet_nobody(TestGreeter) [./test/test_greeter.rb:13]:
  <"Hi there!"> expected but was
  <"uO i\032X#mcV">.

  2 tests, 2 assertions, 1 failures, 0 errors
  Tests failed -- this is good
  2 mutations remaining...
  Replacing Greeter#greet with:

  def greet
    if @person.nil? then
      "Hi there!"
    else
      "Hi #{@person}\0204\026\036]7D\020#wC\010&=-\004\017\t7.x\036\ap07hqO\f^\025\003+P\016]<0M\vV`lbU\e"
    end
  end
  Loaded suite /usr/local/bin/heckle
  Started
  F.
  Finished in 0.001194 seconds.

    1) Failure:
  test_greet(TestGreeter) [./test/test_greeter.rb:8]:
  <"Hi Kevin!"> expected but was
  <"Hi Kevin\0204\026\036]7D\020#wC\010&=-\004\017\t7.x\036\ap07hqO\f^\025\003+P\016]<0M\vV`lbU\e">.

  2 tests, 2 assertions, 1 failures, 0 errors
  Tests failed -- this is good
  1 mutations remaining...
  Replacing Greeter#greet with:

  def greet
    if @person.nil? then
      "Hi #{@person}!"
    else
      "Hi there!"
    end
  end
  Loaded suite /usr/local/bin/heckle
  Started
  FF
  Finished in 0.001984 seconds.

    1) Failure:
  test_greet(TestGreeter) [./test/test_greeter.rb:8]:
  <"Hi Kevin!"> expected but was
  <"Hi there!">.

    2) Failure:
  test_greet_nobody(TestGreeter) [./test/test_greeter.rb:13]:
  <"Hi there!"> expected but was
  <"Hi !">.

  2 tests, 2 assertions, 2 failures, 0 errors
  Tests failed -- this is good
  No mutants survived. Cool!
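
To make the grab-the-tree, change-one-thing, try-again cycle a bit more concrete, here is a toy sketch of the "find the mutation points" half. It works on nested arrays rather than real ParseTree output: the node names (:if, :str, :dstr and friends) and the helpers mutation_points, mutate_at and all_mutants are assumptions for illustration, not Heckle's internals. Turning a mutated tree back into Ruby and redefining the method is the part RubyToRuby handles, and it is left out here.

```ruby
# Simplified stand-in for the tree of Greeter#greet. The shape is an
# assumption for this sketch, not the exact output of ParseTree.
GREET_TREE = [:if,
  [:call, [:ivar, :@person], :nil?],
  [:str, "Hi there!"],
  [:dstr, "Hi ", [:evstr, [:ivar, :@person]], [:str, "!"]]
].freeze

# Yield the path (a list of child indexes) of every node we can mutate.
def mutation_points(node, path = [], &block)
  return unless node.is_a?(Array)
  yield path if [:str, :if].include?(node.first)
  node.each_with_index do |child, i|
    mutation_points(child, path + [i], &block)
  end
end

# Copy the nested arrays so each mutant starts from a pristine tree.
def deep_copy(node)
  node.is_a?(Array) ? node.map { |child| deep_copy(child) } : node
end

# Return a copy of the tree with exactly one node changed.
def mutate_at(tree, path)
  copy = deep_copy(tree)
  target = path.reduce(copy) { |node, i| node[i] }
  case target.first
  when :str
    target[1] = "mutated" # Heckle uses random characters instead
  when :if
    target[2], target[3] = target[3], target[2] # flip the branches
  end
  copy
end

# One mutant per mutation point, each differing from the original in one place.
def all_mutants(tree)
  points = []
  mutation_points(tree) { |path| points << path }
  points.map { |path| mutate_at(tree, path) }
end

mutants = all_mutants(GREET_TREE)
puts "#{mutants.size} possible mutations"
mutants.each { |mutant| p mutant }
```

For this tree the sketch finds three mutation points, the same three that show up in the verbose run: the flipped branches, the "Hi there!" string, and the trailing "!" of the interpolated string.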

FAQ

So what can Heckle.. um.. heckle?

In version 1.1, Heckle will create random replacements for: Strings, Regexps, Symbols, Ranges, and the Numeric types (Fixnum, Float, Bignum). It will flip true to false and vice versa. It will also flip the branches on if and unless statements, as well as until and while statements.
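
As a rough picture of what "random replacement" means per type, here is a small sketch. The helper names mutate_literal and random_string are made up, and the replacement rules are my guesses for illustration rather than Heckle's actual ones. The only property the sketch cares about is that the replacement is the same kind of value and is different from the original.

```ruby
# Hypothetical helper: a random printable string that differs from +except+.
def random_string(rng, except)
  loop do
    candidate = Array.new(rng.rand(1..20)) { rng.rand(32..126).chr }.join
    return candidate unless candidate == except
  end
end

# Hypothetical helper: swap a literal for a different value of the same kind.
def mutate_literal(value, rng = Random.new)
  case value
  when true    then false
  when false   then true
  when String  then random_string(rng, value)
  when Symbol  then random_string(rng, value.to_s).to_sym
  when Integer then value + rng.rand(1..100)
  when Float   then value + rng.rand(1..100)
  # Assumes a range with numeric endpoints; shifting the start changes it.
  when Range   then (value.begin + 1)..(value.end + rng.rand(1..100))
  else raise ArgumentError, "no mutation for #{value.inspect}"
  end
end

["Hi there!", :name, 42, 3.14, (1..10), true].each do |literal|
  puts "#{literal.inspect} -> #{mutate_literal(literal).inspect}"
end
```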

I used Jester and it was really slow. How’s Heckle?

Really very fast. There’s no compile step for Heckle (as there is when you modify Java code with Jester), so the bottleneck is usually your tests. Fast tests mean fast heckling.

What other options can Heckle take?

The other significant option heckle takes is --tests. This flag is used to give a pattern (Glob format) which matches the tests that should be loaded. This defaults to “test/test_*.rb”. If you have lots of test files and really only care about a few for a certain class, you may want to specify them using --tests to speed things up.
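
If you're not sure what a given pattern will pick up, you can try the glob in plain Ruby first. This sketch builds a throwaway project in a temp directory and compares the default pattern with a narrower one. The file names and the helpers matching_tests and with_demo_project are invented for the example, and it only shows how glob matching behaves, not how Heckle loads the files.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: list the files under +root+ that a glob selects.
def matching_tests(root, pattern = "test/test_*.rb")
  Dir.glob(pattern, base: root).sort
end

# Build a throwaway project layout so the sketch is self-contained.
def with_demo_project
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, "test"))
    %w[test_greeter.rb test_mailer.rb helper.rb].each do |name|
      FileUtils.touch(File.join(root, "test", name))
    end
    yield root
  end
end

with_demo_project do |root|
  puts "default pattern:  #{matching_tests(root).inspect}"
  puts "narrower pattern: #{matching_tests(root, 'test/test_greeter.rb').inspect}"
end
```

The default pattern skips helper.rb because it doesn't start with test_, and the narrower pattern leaves only the one file you care about.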

Also, though I didn’t show it in the examples, Heckle can run against a single method by supplying it after the class name.

If it modifies code, can’t bad things happen?

Well, yes. Heckle could feasibly break things. It throws crap into your code on purpose. It flips while and until loops, so infinite loops will probably occur at some point. For the next release I’m planning to put in some sort of timeout to avoid that.
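
In the meantime, here is one way a timeout guard could look, using Ruby's standard Timeout module. This is only a sketch of the idea and not what Heckle does or will do: run_with_timeout, countdown and mutant_countdown are hypothetical names, and the 0.2 second limit is arbitrary.

```ruby
require "timeout"

# Hypothetical wrapper: run a check, but give up if it takes too long.
# A timed-out mutant is treated as caught, since the tests did not pass.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) do
    result = yield
    result ? :passed : :failed
  end
rescue Timeout::Error
  :timed_out
end

# Original method: counts n down to zero.
def countdown(n)
  n -= 1 while n > 0
  n
end

# Hand-written mutant with the loop flipped from while to until.
# For n = 0 the condition never becomes true, so it loops forever.
def mutant_countdown(n)
  n -= 1 until n > 0
  n
end

puts run_with_timeout(0.2) { countdown(0) == 0 }
puts run_with_timeout(0.2) { mutant_countdown(3) == 0 }
puts run_with_timeout(0.2) { mutant_countdown(0) == 0 }
```

The flipped loop either never runs or never stops, depending on the input, which is why the third check needs the timeout to come back at all.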

Additionally, know what your code is doing. If randomly changing a string is going to actually break things irrevocably in testing, you probably should be stubbing those dangerous methods (e.g., you probably shouldn’t run Heckle against methods that really delete files during testing if the file to delete is chosen by a string).
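
One low-tech way to do that stubbing is to keep the dangerous call in its own small method and override it in the tests. The classes below are invented for the example (TempCleaner, RecordingCleaner and the path are all hypothetical). The point is only that a mutated string ends up in a list instead of being handed to File.delete.

```ruby
# Hypothetical class whose behaviour depends on a string literal.
class TempCleaner
  def clean
    remove("/tmp/myapp-cache")
  end

  # The dangerous part lives in one small method...
  def remove(path)
    File.delete(path)
  end
end

# ...so the tests can use a subclass that records the request
# instead of touching the disk.
class RecordingCleaner < TempCleaner
  def removed
    @removed ||= []
  end

  def remove(path)
    removed << path
  end
end

cleaner = RecordingCleaner.new
cleaner.clean
p cleaner.removed
```

If the string in clean gets mutated, a test that asserts on removed fails harmlessly, and nothing on disk is deleted.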

But, does it work with Rails?

You bet your sweet tests. However, you probably want to run against methods by hand since Rails tends to add a whole bunch of methods on the fly (with associations, validations and other helpers) that you wouldn’t want to heckle.

Is there rSpec Support?

I used Test::Unit for my examples, but I’ve been working with Aslak Hellesoy on the rSpec team to make sure support is there, and they’ve added a --heckle flag which should be there in the next version.

Wait, so this is like… testing my tests?

Basically. Cool, huh?

Thanks

A big thanks to Ryan Davis for starting me on this whirlwind, and to him and Eric Hodel for ParseTree and RubyToRuby. Aslak Hellesoy also deserves recognition for his help refactoring the reporting system and his work with rSpec integration.

I’m really excited about this project, and I think it has a lot to offer the testing world. I’m sure there are bugs, so feel free to report them at the rubyforge tracker.

Help spread the word by digging Heckle.


Comments

  1. Faisal N. Jawdat said about 8 hours later:

    Does this rely on ruby2ruby 1.1.2? gem doesn’t find it online, although it looks like one could manually download 1.1.1.

  2. Kevin Clark said about 8 hours later:

    Sorry, looks like there was a problem with the ruby2ruby push last night. I’m investigating.

    Yes, this will work with ruby2ruby 1.1.1, but there are bug fixes in 1.1.2 that you should have as soon as you can get it.

  3. Aslak Hellesøy said about 8 hours later:

    Great intro Kevin! See Heckle with RSpec for the RSpec version.

  4. Kevin Clark said about 9 hours later:

    Ruby2Ruby is having gem propagation issues. Feel free to download the gem directly (http://rubyforge.org/frs/download.php/15738/ruby2ruby-1.1.2.gem) and install via gem install ruby2ruby-1.1.2.gem.

  5. Laurel Fan said about 9 hours later:

    Ha! This is so cool! (except for the crashing on half of my classes part :))

  6. Kevin Clark said about 9 hours later:

    Laurel: Thanks for the bug reports earlier! Keep it up as you find them.

  7. Klondike said about 10 hours later:

    I hope we can look forward to more innovations from the magical team at Powerset. If you’re a search engine, you better get ready to get rocked.

  8. floyd said about 10 hours later:

    Yes, cool. But you know what would be cooler? A way to test these test tests.

  9. Klondike said about 10 hours later:

    I don’t understand floyd—would a program like that be testable?

  10. Kevin Clark said about 12 hours later:

    Heckle will soon be able to heckle itself, if that’s what you’re looking for. It wasn’t able to previously because of a bug in ruby2ruby or ParseTree (I don’t recall) which has since been fixed.
