Terminology is one of the pain points of Software Testing. Engineers tend to use terms differently, and there is no single place that can be considered a source of truth. The only sources considered authoritative are ISTQB, IEEE 829 and SWEBOK. These sources don't always agree, and sometimes their definitions simply don't seem logical. The intention of this post is to suggest the terminology that makes the most sense from a practical viewpoint and to outline the problems in the existing "standards". In particular we'll talk about Test Levels and Test Types.
Of course, to communicate with others we don't necessarily need a single golden vocabulary. Often that's not even feasible (try doing it with Design Patterns). In these situations a good choice is to know all the different sorts of terminology - that way, even if we use different terms, we still understand each other. That's why the article is full of links to other sources; feel free to click on them for an extra dose of information.
Test Level describes how many chunks of our SUT (System Under Test) we initialize for testing. This can be a class, several classes, or even the whole system. These are the basic levels we discuss in the majority of cases:
Note that others may have different definitions of Test Levels ref, ref, ref, but unfortunately those introduced a lot of confusion ref, and therefore I wouldn't consider them final. Having commonly used Test Levels doesn't stop us from coming up with our own levels, specific to our project and team. Let's say we have a UI working with a Back End via Services. Given that we test our functionality via those Services instead of the UI, let's define a Service Test Level (Martin Fowler calls these Subcutaneous Tests). You may call it differently in your team.
Interestingly, Googlers came up with their own test levels: Small, Medium and Large tests, which are pretty neat. Should we use those instead of Unit, Component and System tests? Not sure, but it's worth at least knowing about them.
Also, there are a number of different test types ref. We'll talk only about the most popular:
There are plenty of other testing types, classified by different criteria or subdivided even further, but we don't need them for the sake of this post.
Looking at the terminology from the authoritative sources, you may notice a couple of problems:
The first two issues are hardly the reason this post was written. They certainly had an impact, but without a bigger problem we'd simply be creating yet another terminology for others to challenge.
What we really want to create here is a system, a rule. This is similar to what they have in physics - one mathematical model that can be applied in all sorts of situations and lets us predict things we haven't discovered yet. The current situation in testing is different: test suites, levels, types - they are all named in their own unique ways, which either cannot be extended to every possible situation or force you to learn each one of them separately.
Types and Levels are not mutually exclusive; they are combined into Test Suites ref. E.g. we could have:
If we use the aforementioned terminology, it boils down to an NxM matrix:
In chemistry there is an organization called IUPAC that created the nomenclature chemists use. What is peculiar about it is that it suggests very complicated names for compounds, like (2E)-3-phenylprop-2-enal instead of easier ones like Cinnamaldehyde.
What's the point of this, and why did the idea become popular? Because it introduces rules that can be used to name anything. Instead of memorizing the different names of thousands of compounds, you just need to know the naming rules. Even more - when a new compound is created, you don't have to invent a nice name for it every time - you can start with the standard IUPAC name. Of course, for a layman these rules are something foreign, but for an educated chemist they are a blessing that solves bigger problems! Watch this video from CrashCourse for inspiration.
So here we suggest The Rule for naming test suites/varieties. And it's much easier than the IUPAC rules.
The Rule should mostly be clear to others because it's intuitive enough. But there is one term that makes life harder: integration testing.
People use this term in many different ways, and because they got used to one definition or another, it's pretty hard to make them use it differently. Before we argue about which definition makes the most sense, let's list some of them:
IEEE 829 definitions:
Integration Testing: "Testing in which software components, hardware components, or both are combined and tested to evaluate the interaction among them. This term is commonly used for both the integration of components and the integration of entire systems."
System Integration Testing: "Testing conducted on multiple complete, integrated systems to evaluate their ability to communicate successfully with each other and to meet the overall integrated systems’ specified requirements."
The term Integration Testing is bad. It's bad because it fits all of these definitions, which effectively means people will never agree on it - everyone will have arguments for their own definition and against the definitions of others. So which one do we choose?
We can apply a trick here - let's suppose we want to get rid of the term altogether. Which definitions can't we replace with others? We can replace the Devs' definition with "Component Tests", and the one about REST/SOAP with "Service Tests". Only one cannot be easily replaced - the one about integration between different systems. And that's how we determine which meaning fits best. That's why we define Integration Testing as an activity that checks that our SUT collaborates with other systems correctly.
But then there is another problem with the term - most people are used to the term System Integration Testing, which doesn't fit our model at all. You see, according to The Rule the word "System" here means that we fully deploy our app, while "Integration" means that we check how it collaborates with others. But it doesn't mean that we test with the real integrations running together! We can use Stubs instead of the real apps. This can be confusing because the phrase "System Integration Testing" sounds natural on its own - without knowing the terminology, we start imagining that we test multiple real systems working together.
So let's reiterate: Integration Testing is aimed at checking how the app collaborates with others. This type of testing can be used on different Test Levels - e.g. Component Level or System Level. System Level testing does not mean we deploy multiple real systems - SIT can be achieved via different means, including but not limited to working with Stubs or with real integrations.
How do we distinguish between SIT with Stubs and SIT with Real Integrations? Well, you can name them as is: "SIT with Stubs" and "SIT with Integrations". You can always come up with strange-looking abbreviations like SIT/S and SIT/I. There is nothing else you can do without losing information about the actual activities. Alternatively, you can come up with your own custom level, like MegaSystem tests :)
It's one thing to test how your app works with others, and a different thing to test multiple apps together. The former searches for bugs in your system, while the latter is aimed at the behaviour of a cluster of systems, which usually requires more integrations to be up and running. This deserves its own test level. ISTQB has a very nice term, Big Bang Testing ref, which sounds cool, but unfortunately they consider it simply a synonym for Integration Testing. We may ignore this fact and still use the term Big-Bang Level to denote multiple-system testing, or we can come up with a custom term like Cluster Level.
Here we replaced Integration Testing with Component Level, but is our terminology ideal? Unfortunately no - the term Component is weird as well. E.g. IEEE 829 uses it interchangeably with Unit Testing (though no developer would call it that), and in general the word Component is pretty generic. Why then did we choose it to denote one of our levels? Two reasons: a) lack of a better candidate, and b) it's used rarely in common speech, so there is a slim chance someone will be upset if we borrow it for our needs.
Software Testing is still a very young and developing discipline, and I doubt we'll have solid terminology in the near future. So let's use the term "Component Level" until a better one is invented.
What we've done in this article is show some holes in testing terminology - the fact that there is no single place that can be considered a source of truth. All the current authoritative sources have their problems, and few people in the wild actually use their terminology.
We've also tried to fill these holes with our own suggestions for naming rules that can be extended to any situation and organization.
And one last thing - we discussed the most abused term in Software Testing history, Integration Testing, and showed how we can reduce the amount of abuse towards it.
3.1.45 test level: A separate test effort that has its own documentation and resources (e.g., component, component integration, system, and acceptance).
8.2.1 (MTP Section 2.1) Test processes: ... Examples of possible additional test levels include security, usability, performance, stress, recovery, and regression
This becomes a mess, since according to this classification you can use the term Level to refer to anything in testing.
Ref: After TMap
Synonyms: test stage
A group of test activities that are organized and managed together. A test level is linked to the responsibilities in a project. Examples of test levels are component test, integration test, system test and acceptance test.
This is a slightly different approach to Test Levels: they consider a level to be a test stage. This goes hand in hand with our definition until Acceptance Tests, which are fine as a test stage but don't seem natural to call a Level. Moreover, ISTQB introduces another similar term, Test Phase:
Ref: After Gerrard
A distinct set of test activities collected into a manageable phase of a project, e.g., the execution activities of a test level.
Though from an English-language perspective, the difference between a Test Stage and a Test Phase is not clear.
Software testing is usually performed at different levels throughout the development and maintenance processes. Levels can be distinguished based on the object of testing, which is called the target, or on the purpose, which is called the objective (of the test level).
Same as IEEE 829 - it calls everything a Test Level: both Unit Testing and Functional Testing are Test Levels according to this definition.
3.1.6 component: One part that makes up a system. A component may be hardware or software and may be subdivided into other components. (adopted from IEEE Std 610.12-1990 [B3])
NOTE—The terms “module,” “component,” and “unit” are often used interchangeably or defined to be subelements of one another in different ways depending on the context. The relationship of these terms is not yet standardized.
3.1.8 component testing: Testing of individual hardware or software components. (adopted from IEEE Std 610.12-1990 [B3])
NB: according to this specification, component may be considered a synonym for unit. Per the excerpt, Component Integration Testing is what we call Component Testing in this article.
Ref: After TMap
A group of test activities aimed at testing a component or system focused on a specific test objective, i.e. functional test, usability test, regression test etc. A test type may take place on one or more test levels or test phases.
Of course, Test Suite is a broader term - it's just a group of tests. We can split them, nest them, join them, which means we could have multiple test suites for System Functional Testing, e.g. grouped by priority.
Ref: After IEEE 610 See Also: integration testing
An integration testing approach in which software elements, hardware elements, or both are combined all at once into a component or an overall system, rather than in stages.