Aggregate Failures in RSpec
Checking multiple attributes after exercising a method is a common test pattern. The problem is this seems to break a core guideline for testing: one spec has one expectation (http://betterspecs.org/#single). But the key point to remember is that specs should verify a single idea.
Testing two completely different expectations in the same spec is bad practice. A spec proving a method is called with correct params should not also verify the response; they should be two different specs: “it is called with correct params” and “it receives success response”.
However, the ambiguity of the single expectation guideline comes when testing multiple attributes on the same object to verify state. A refund spec needs to ensure that the status is successful and that the amount is correct. These cases do not violate the guideline because they are verifying a single idea: the response object has correct state.
A reasonable boilerplate for imminent code examples
We’re going to look at refunding a charge using the stripe gem (Stripe's Ruby API library). The Stripe::Refund.create method is called with the id of the charge to be refunded. Stripe then returns an API response wrapped in a Stripe::Refund object.
Stripe::Refund.create(charge: "ch_18vE1uGt3Swi8TrvRRvhDaaV")
#<Stripe::Refund id=re_19PYK0Gt3Swi8TrvrZfMcNJY 0x00000a> JSON: {
"id": "re_19PYK0Gt3Swi8TrvrZfMcNJY",
"object": "refund",
"amount": 100,
"charge": "ch_18vE1uGt3Swi8TrvRRvhDaaV",
"created": 1481466540,
"currency": "usd",
"status": "succeeded"
}
The boilerplate below is assumed for each upcoming example to keep the individual specs focused. Just remember that refund is the Stripe::Refund response.
require "rspec/autorun"
require "stripe"

describe Stripe::Refund, ".create" do
  subject(:refund) { Stripe::Refund.create(charge: "ch_18vE1uGt3Swi8TrvRRvhDaaV") }

  # specs will go here
end
To simulate failing specs for all examples, assume the refund response returns the unexpected attributes amount: 0 and status: "failed" along with a valid id: "re_19PYK0Gt3Swi8TrvrZfMcNJY".
NOTE: There is some hand-waving. A charge needs to be created before each refund spec. You would probably also want to use something like the VCR gem to cache responses so you are not constantly hitting the API.
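If you want to follow along without a Stripe account, one option is to swap the API call for a plain Ruby stand-in. The sketch below is an assumption made for illustration only: FakeRefund and failing_refund are hypothetical names, not part of the stripe gem, and the struct mimics only the three attributes the specs read.

```ruby
# Hypothetical stand-in for the Stripe::Refund response (not part of the stripe gem).
# It exposes only the attributes the upcoming specs check.
FakeRefund = Struct.new(:id, :amount, :status, keyword_init: true)

# Builds the "unexpected" response assumed by the failing examples:
# a valid id, but the wrong amount and status.
def failing_refund
  FakeRefund.new(
    id: "re_19PYK0Gt3Swi8TrvrZfMcNJY",
    amount: 0,
    status: "failed"
  )
end

refund = failing_refund
puts refund.amount
puts refund.status
```

With this in place, the subject in the boilerplate could be written as subject(:refund) { failing_refund } to reproduce the failures offline.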
Example #1: The common way of verifying multiple attributes
Browse any Ruby codebase and you are likely to see a spec like this.
it "returns success response from stripe" do
  expect(refund.amount).to eq 100
  expect(refund.status).to eq "succeeded"
  expect(refund.id).to start_with "re_"
end
There are separate expectations for each attribute. This works, but there is a catch: as soon as one expectation fails, the spec aborts and the remaining expectations never run.
F
Failures:
1) Stripe::Refund .create returns success response from stripe
Failure/Error: expect(refund.amount).to eq 100
expected: 100
got: 0
(compared using ==)
# refund.rb:19:in `block (2 levels) in <main>'
Finished in 0.04063 seconds (files took 0.17161 seconds to load)
1 example, 1 failure
Failed examples:
rspec refund.rb:18 # Stripe::Refund .create returns success response from stripe
Notice there is no mention of the status being “failed”. This is because RSpec does not make it that far; it stops at the first failed expectation, when the amount does not match.
Example #2: The simplicity and bloat of the have_attributes matcher
RSpec provides a matcher for checking multiple attributes on an object in a single assertion: have_attributes.
it "returns success response from stripe" do
  expect(refund).to have_attributes(
    amount: 100,
    status: "succeeded",
    id: a_string_starting_with("re_")
  )
end
This clarifies what is being tested. Instead of three expectations, a single object is examined to ensure correct state. However, there is a tradeoff. While the spec is reduced to a single expectation, the RSpec failure output is much harder to read.
F
Failures:
1) Stripe::Refund .create returns success response from stripe
Failure/Error:
expect(refund).to have_attributes(
amount: 100,
status: "succeeded",
id: a_string_starting_with("re_")
)
expected #<Stripe::Refund amount=0, status="failed", id="re_19PYK0Gt3Swi8TrvrZfMcNJY">
to have attributes {:amount => 100, :status => "succeeded", :id => (a string starting with "re_")}
but had attributes {:amount => 0, :status => "failed", :id => "re_19PYK0Gt3Swi8TrvrZfMcNJY"}
Diff:
@@ -1,4 +1,4 @@
-:amount => 100,
-:id => (a string starting with "re_"),
-:status => "succeeded",
+:amount => 0,
+:id => "re_19PYK0Gt3Swi8TrvrZfMcNJY",
+:status => "failed",
# refund.rb:25:in `block (2 levels) in <main>'
Finished in 0.02815 seconds (files took 0.17159 seconds to load)
1 example, 1 failure
Failed examples:
rspec refund.rb:24 # Stripe::Refund .create returns success response from stripe
Way more noise; way less helpful. There is a lot of repetition and it is not immediately apparent what went wrong.
Example #3: Testing multiple attributes using one-liners
So maybe this is where single expectation specs come to the rescue? Example #1 only showed a single failure although there were actually two. Example #2 showed all failures but made it less apparent what went wrong due to noise.
We’ll now create a spec for each attribute.
context "when successful" do
  specify { expect(refund.amount).to eq 100 }
  specify { expect(refund.status).to eq "succeeded" }
  specify { expect(refund.id).to start_with "re_" }
end
NOTE: The specify keyword is just an alias for it. In this scenario, it reads better.
FF.
Failures:
1) Stripe::Refund .create when successful should eq 100
Failure/Error: specify { expect(refund.amount).to eq 100 }
expected: 100
got: 0
(compared using ==)
# refund.rb:33:in `block (3 levels) in <main>'
2) Stripe::Refund .create when successful should eq "succeeded"
Failure/Error: specify { expect(refund.status).to eq "succeeded" }
expected: "succeeded"
got: "failed"
(compared using ==)
# refund.rb:34:in `block (3 levels) in <main>'
Finished in 0.01369 seconds (files took 0.08941 seconds to load)
3 examples, 2 failures
Failed examples:
rspec refund.rb:33 # Stripe::Refund .create when successful should eq 100
rspec refund.rb:34 # Stripe::Refund .create when successful should eq "succeeded"
Running the specs prints the expected failures: 1 pass, 2 fails. Better in some ways, but there are downsides:
- The error messages are misleading. They read “Stripe::Refund .create when successful should eq …”, which implies refund returns different values in each spec.
- A new refund is processed for each spec, needlessly slowing down the test suite.
- It is not easy to see that all failures come from the same response.
Example #4: Aggregating failure responses for surprise-free state checking
We now come full circle to the first and most common way: writing three expectations in the same spec… with a twist.
it "returns success response from stripe" do
  aggregate_failures "stripe response" do
    expect(refund.amount).to eq 100
    expect(refund.status).to eq "succeeded"
    expect(refund.id).to start_with "re_"
  end
end
Notice the aggregate_failures method? It was added in RSpec 3.3. It prevents a spec from aborting after the first failure. All expectations inside the block are run, and their failures are compiled into a single summary.
F
Failures:
1) Stripe::Refund .create returns success response from stripe
Got 2 failures from failure aggregation block "stripe response".
# refund.rb:45:in `block (2 levels) in <main>'
1.1) Failure/Error: expect(refund.amount).to eq 100
expected: 100
got: 0
(compared using ==)
# refund.rb:46:in `block (3 levels) in <main>'
1.2) Failure/Error: expect(refund.status).to eq "succeeded"
expected: "succeeded"
got: "failed"
(compared using ==)
# refund.rb:47:in `block (3 levels) in <main>'
Finished in 0.01508 seconds (files took 0.0881 seconds to load)
1 example, 1 failure
Failed examples:
rspec refund.rb:44 # Stripe::Refund .create returns success response from stripe
Aggregating failures gives us some great benefits over the previous three examples:
- A single failure indicating the response was incorrect.
- Refund request only made a single time; fast tests!
- All expectations run; no hidden failures.
- Clean expected/got breakouts; less noise; more skimmable.
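The difference between aborting on the first failure and collecting all of them can be sketched in plain Ruby. This is only an illustration of the idea, not how RSpec implements aggregate_failures; ExpectationFailed, check, and collect_failures are hypothetical names invented for the sketch.

```ruby
# Illustration only: a hand-rolled collector, not RSpec's implementation.
class ExpectationFailed < StandardError; end

# Fail-fast check: raises as soon as a value does not match.
def check(label, actual, expected)
  return if actual == expected

  raise ExpectationFailed,
        "#{label}: expected #{expected.inspect}, got #{actual.inspect}"
end

# Runs every check and collects the failure messages
# instead of stopping at the first mismatch.
def collect_failures(checks)
  checks.each_with_object([]) do |(label, actual, expected), failures|
    begin
      check(label, actual, expected)
    rescue ExpectationFailed => e
      failures << e.message
    end
  end
end

# The unexpected response assumed throughout the examples.
refund = { amount: 0, status: "failed" }

failures = collect_failures([
  ["amount", refund[:amount], 100],
  ["status", refund[:status], "succeeded"]
])

puts "Got #{failures.size} failures:"
failures.each { |message| puts "  #{message}" }
```

Calling check directly behaves like Example #1: the status mismatch would never be reported because the amount check raises first. Wrapping the checks in collect_failures reports both.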
Aggregate failures with an RSpec tag
As an alternative to the block syntax, you can use the :aggregate_failures tag. This will aggregate all failures within the spec without changing the spec structure:
it "returns success response from stripe", :aggregate_failures do
  expect(refund.amount).to eq 100
  expect(refund.status).to eq "succeeded"
  expect(refund.id).to start_with "re_"
end
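If most specs in a suite would benefit from the tag, it can also be applied through derived metadata rather than spec by spec. The following is a minimal sketch, assuming a spec helper file that RSpec loads before the specs; the file location is an assumption, and whether a suite-wide default suits your project is a judgment call.

```ruby
# spec/spec_helper.rb (assumed location)
require "rspec/core"

RSpec.configure do |config|
  # Tag every example with :aggregate_failures unless it sets the key itself.
  config.define_derived_metadata do |meta|
    meta[:aggregate_failures] = true unless meta.key?(:aggregate_failures)
  end
end
```

With this in place, an individual spec can still opt out by passing aggregate_failures: false as metadata.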