The Test Pyramid: The Boston 4th of July Fireworks Analogy

Mary Ann Nazario-Belarmino
9 min read · Oct 8, 2020
Photo by Designecologist from Pexels

What can the Boston 4th of July fireworks teach us about the importance of the Test Pyramid?

First of all, what is a Test Pyramid?

The Test Pyramid is an industry best practice which suggests that, when building up your test automation suite,

  • most of your tests should be unit tests*,
  • the least should be UI-based functional tests*,
  • and in between, you should have API-based functional tests*.

*(see the brief definitions of these test terms at the bottom of this article)

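The Test Pyramid is visually expressed as a triangle or pyramid shape, with unit tests at the base, API-based functional tests in the middle, and UI-based functional tests at the top.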

There are countless articles you can read on why this is the recommended distribution for tests, but in a nutshell, the main reason is that, as you go up the pyramid, the overall cost increases and the benefits decrease. UI-based tests (at the top of the pyramid) are known to require high maintenance, run slowly, and fail intermittently for reasons that are hard to identify, among other things. Unit tests (at the base of the pyramid), on the other hand, are very fast, reliable, and stable. They also improve your productivity: unlike a UI-based test, a failing unit test points you straight to the offending function or logic in your code.

Despite lots of efforts to convince and educate developers and test engineers about the importance of following the Test Pyramid guideline, most development organizations still end up with an inverted pyramid — the total opposite!


Why is that? I am sure there are different reasons, both technical and cultural in nature. But one thing I am sure of is that most software developers do not quite understand the implications (costs and benefits) of having each test type, and why there is a need to invest in some more than others. They likely see the Test Pyramid as some theoretical concept that is nice to read about but is neither practical nor easy to follow.

In my experience, an analogy that people can relate to or visualize is a very effective way to build a better understanding and deeper appreciation of an abstract concept. So I’d like to use the Boston 4th of July fireworks show to deepen your understanding and appreciation of this lowly but all too important “test pyramid” concept.

The Boston 4th of July Fireworks Analogy

Every 4th of July, Boston hosts a fireworks display along the Charles River. For me, two attributes describe this show. First, it’s spectacular and very well designed. Second, it never fails, year after year: it is unlikely to get shut down because of a faulty setup or technical problems. It is a well-thought-out project, and the tolerance for failure is really low, maybe even zero.

If you were given the very important task of testing the Boston fireworks setup to have 100% confidence that it will work as designed on the 4th of July, how would you test it?

The “Trigger Happy” Approach

The most obvious thing you would probably do is hit the trigger button and observe that the fireworks show indeed works as expected from end to end. This is how it will be triggered on the 4th — so why not test it this way, right?

By giving Boston residents a preview of the fireworks show, this approach will make them happy. But will it make your managers happy? Most likely not.

Doing this type of testing does not make sense for multiple reasons.

  • It is very expensive. It will require a lot of staff to set up and will cost a lot of money, maybe thousands of dollars’ worth of pyrotechnics!
  • If anything goes wrong, how would you isolate where the fault is? Where do you start? How can you reproduce it?
  • You cannot test this way multiple times — it will require a lot of effort, time, and money to set it up every time.

For very obvious reasons, it is wise to avoid the “trigger happy” approach at all costs, especially if you value your job.

The “Only Smiles Allowed Here” Approach

But maybe you can scale down the test instead.

Instead of doing a full end-to-end test, you can selectively test a subset of the fireworks, in particular, fireworks that really require visual inspection and fireworks that are critical to the “wow” factor of the show.

For example, nowadays, we see smiley face fireworks used all the time in firework shows. This is an example of a firework you can selectively test, where your goal is to make sure that the batch you got actually produces smiley faces and not sad faces.

The “only smiles allowed here” approach is much less expensive than full end-to-end testing, but you will still face the same challenges, just on a smaller scale. Although this approach is still expensive, it is sometimes necessary.


The “Under The Covers” Approach

The fireworks display that everyone sees is just the visuals of the whole system. On land and on water, there is an elaborate setup controlling the entire fireworks display. That means that if you make sure that the underlying setup is doing what it is designed to do, you’ll have a pretty high confidence that the fireworks show will also work as expected.

To test the underlying setup, you will need to unhook the pyrotechnics. Except for the visual aspect, this approach still allows you to verify the end-to-end behavior of the fireworks setup. In place of fireworks, you may have measurement devices to verify that the proper voltages and signals are delivered to the proper destinations at the proper times. To make this testing conveniently repeatable, you should invest in setting up switches that automatically bypass the firing of pyrotechnics, thereby giving you a “no-fireworks” mode.

The “Under The Covers” approach is more scalable and less costly than the approaches mentioned above. It does not require any expensive pyrotechnics to be fired. You also gain some reliability because you are not relying on the visual fireworks to give you feedback about your system. However, apart from the fireworks, this type of testing still requires everything to be up and running, so you still need almost the same time and effort to test it.
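To make the “no-fireworks” mode a little more concrete, here is a minimal Python sketch of the same idea. Everything in it is hypothetical and invented for illustration (the `ShowController`, the `RecordingIgniter`, and the cue sheet format); it only shows how a recording stand-in can take the place of the real pyrotechnics while the rest of the setup runs end to end.

```python
from dataclasses import dataclass, field
from typing import List, Protocol, Tuple


class Igniter(Protocol):
    """Anything that can receive a firing signal (hypothetical interface)."""

    def fire(self, channel: int, at_second: float) -> None: ...


@dataclass
class RecordingIgniter:
    """Stand-in for real pyrotechnics: records signals instead of firing."""

    signals: List[Tuple[int, float]] = field(default_factory=list)

    def fire(self, channel: int, at_second: float) -> None:
        self.signals.append((channel, at_second))


class ShowController:
    """Hypothetical controller that walks a cue sheet and fires each cue."""

    def __init__(self, igniter: Igniter) -> None:
        self.igniter = igniter

    def run(self, cue_sheet: List[Tuple[float, int]]) -> None:
        # Each cue is (time in seconds, channel); fire them in time order.
        for at_second, channel in sorted(cue_sheet):
            self.igniter.fire(channel, at_second)


def run_in_no_fireworks_mode(
    cue_sheet: List[Tuple[float, int]],
) -> List[Tuple[int, float]]:
    """Run the whole show with the pyrotechnics unhooked; return what was sent."""
    recorder = RecordingIgniter()
    ShowController(recorder).run(cue_sheet)
    return recorder.signals
```

Calling `run_in_no_fireworks_mode([(2.5, 7), (0.0, 3)])` returns the recorded signals in firing order, so a test can check that the right channel was signaled at the right time without anything being lit.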

The “Break It Down” Approach

Or… you can look at this challenge differently.

Instead of testing it as a whole system, why not test the individual components** that make up the whole setup? That is, identify the components that drive the show and then make sure that each one works as expected. By making sure a component behaves as expected or delivers the expected outputs (e.g. voltage) given some set of inputs or triggers, that component will likely do so at show time.

To test a component completely independently, you can fake its dependence on other components by removing any direct connections and feeding it dummy input instead.

As long as the individual components behave reliably, i.e., their behavior is consistent across repeated verifications, you can be confident that the integration of multiple components will very likely behave as expected, which avoids the need for repeated, expensive end-to-end testing. Note, however, that some tests may still be necessary to verify the “integration points” specifically, for latency and other integration-related issues.

This testing approach is much cheaper than all the approaches discussed above. You can test components in isolation and run tests almost anytime because the cost of testing is so small: you do not need the whole crew to set it up.

** In software, the term “component” has a special meaning and is usually not interchangeable with “unit”. So when I say “testing components in the fireworks setup”, I am referring to the smallest functional units in the setup; this should not be mistaken for “component testing”.

Back to the Test Pyramid

To build confidence that the Boston fireworks setup will work as expected on the 4th of July, I’ve presented the different testing approaches you can take. You’ve seen that each approach has both benefits and costs. The fireworks analogy shows that, to test efficiently and at scale without incurring unreasonable costs or losing coverage, you should invest more in low-level testing and less in high-level testing. It is also important to understand that the goal is not to choose only one testing approach but to employ a combination of them.

This real-life analogy is applicable to software. The approaches mentioned are analogous to the different testing approaches in the test pyramid.

The “trigger happy” and “only smiles allowed here” approaches map to high-level, end-to-end visual testing, or UI-based functional testing. As you’ve seen in the 4th of July analogy, there are cases when this is necessary: it allows us to verify the critical use cases and end-to-end scenarios that our software supports. However, such tests should be limited because of their cost and complexity.

The “under the covers” testing approach maps to API-based functional testing. You need to make sure that the underlying logic and components of your software continue to work as expected regardless of how they are visually presented to users. With this testing approach, you cut down on cost and improve the reliability of your tests by eliminating reliance on visual rendering. These tests are particularly useful if you often redesign the user interface without changing the underlying logic.
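As a rough illustration of what an API-based functional test can look like, here is a small Python sketch. The `add_to_cart` handler and the `PRICES` catalog are hypothetical, made up for this example; a real application would expose its own service layer. The point is only that the tests assert on the data the API returns, not on anything rendered on a screen.

```python
from typing import Dict, List

# Hypothetical price catalog in cents; a real service would query a data store.
PRICES: Dict[str, int] = {"sparkler": 300, "fountain": 1250}


def add_to_cart(cart: List[str], request: Dict[str, str]) -> Dict[str, object]:
    """Hypothetical API handler: JSON-like request in, JSON-like response out."""
    item = request.get("item", "")
    if item not in PRICES:
        return {"status": 404, "error": f"unknown item: {item}"}
    cart.append(item)
    return {
        "status": 200,
        "items": list(cart),
        "total_cents": sum(PRICES[i] for i in cart),
    }


def test_add_to_cart_returns_running_total() -> None:
    cart: List[str] = []
    add_to_cart(cart, {"item": "sparkler"})
    response = add_to_cart(cart, {"item": "fountain"})
    assert response["status"] == 200
    assert response["total_cents"] == 1550


def test_unknown_item_is_rejected() -> None:
    cart: List[str] = []
    response = add_to_cart(cart, {"item": "confetti cannon"})
    assert response["status"] == 404
    assert cart == []  # a rejected request must not change the cart
```

Because nothing here depends on page layout or button positions, a redesign of the user interface would leave these tests untouched as long as the underlying logic stays the same.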

The “break it down” approach, which is the testing of individual components** in the fireworks setup, maps to unit testing. These tests are usually small and simple, hence very fast. They do not rely on anything outside the “unit” you are testing, hence they are highly reliable. They point you to the offending logic or function when they fail, hence root causes are fast and easy to identify and fix. When you cover a good portion of your code base with unit tests, you lessen the need to test a large number of end-to-end use case permutations.
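Here is a minimal sketch of what such unit tests might look like in Python, using the standard `unittest` module. The functions `fuse_delay_seconds` and `launch_time` are hypothetical units invented for this example. Note how the clock is passed in as a parameter so that a test can feed it a dummy value, much like feeding dummy input to a disconnected component in the fireworks setup.

```python
import unittest
from typing import Callable


def fuse_delay_seconds(length_cm: float, burn_rate_cm_per_s: float) -> float:
    """Hypothetical unit under test: how long a fuse takes to burn."""
    if burn_rate_cm_per_s <= 0:
        raise ValueError("burn rate must be positive")
    return length_cm / burn_rate_cm_per_s


def launch_time(
    now_fn: Callable[[], float], length_cm: float, burn_rate_cm_per_s: float
) -> float:
    """Depends on a clock; the clock is passed in so a test can fake it."""
    return now_fn() + fuse_delay_seconds(length_cm, burn_rate_cm_per_s)


class FuseDelayTest(unittest.TestCase):
    def test_delay_is_length_divided_by_rate(self) -> None:
        self.assertEqual(fuse_delay_seconds(30.0, 2.0), 15.0)

    def test_non_positive_rate_is_rejected(self) -> None:
        with self.assertRaises(ValueError):
            fuse_delay_seconds(30.0, 0.0)


class LaunchTimeTest(unittest.TestCase):
    def test_launch_time_uses_the_injected_clock(self) -> None:
        def fake_now() -> float:
            return 100.0  # dummy input in place of a real clock

        self.assertEqual(launch_time(fake_now, 30.0, 2.0), 115.0)
```

If these classes are saved in a file, running `python -m unittest` on that file executes all three tests. When one fails, the report names the test method, which in turn names the function and the behavior that broke.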

The above analogy has hopefully given you a tangible way of understanding the different testing approaches, and an appreciation of the Test Pyramid. You can now see that the Test Pyramid is not just an abstract concept but one that actually makes a lot of sense to apply when testing your software or application.

Some very brief definitions of terms

  • Functional testing — testing the functionalities of your application against the desired user acceptance criteria
  • UI-based functional testing — testing the functionalities of your application via the application’s user interface
  • API-based functional testing — testing the functionalities of your application via the application’s API service layer
  • Unit testing — testing units of code or logic in the underlying codebase that make up your application


Mary Ann Nazario-Belarmino

Technology Leader in quality management, product ownership, program management, engineering, business analytics, customer success, delivery, reliability.