A Real-World Example of an Automated Test Pipeline

Over the past 15 years, I've been fortunate enough to work as a software engineer for many different companies and startups across the globe. My role is often focused on writing code and producing quality software, but my interests always led me to software testing. Ever since I learned that you could automate testing your code - thanks to Ruby on Rails and its excellent built-in tools - I've been consumed with establishing healthy testing habits for myself and the organization.

Some places I worked at only practiced manual testing. At other places, I was lucky if I found any test case written. In most places, however, we've had some form of test automation taking place, often ensuring that engineers wrote automated tests for our applications. Some organizations had more advanced practices in place, like continuous deployment, which automatically updated production systems multiple times a day.

Given my interest in testing (or obsession, according to most of my teammates), I was often responsible for implementing good testing practices in these organizations. Given that I have heard from former co-workers years later that the practices I established were still going strong within the organization, I think I had substantial success.

Different company, same practices

Most of my professional experience is working on web applications. I've worked on Ruby on Rails applications for the majority of my career. I've also gotten my hands dirty with everything from PHP to JavaScript to Go.

Regardless of the frameworks, libraries, and tech stack used in these organizations, I noticed that most test pipelines for web applications follow similar patterns. These pipelines run a similar set of steps for running automated tests.

Of course, every organization is different, and everyone has different steps in their workflows. Some might need additional steps to build and deploy their software. Others might not be able to fully automate their testing at all, as with speech recognition software, for instance. Still, web applications tend to follow the same path on a higher level.

A real-world test pipeline for web applications

I recently wrapped up work at an organization that manages a portfolio of different web applications. A lot of my recent work there was helping improve their testing practices and tools.

This article contains an overview of the testing pipeline used throughout the process. I won't cover the exact process, so this isn't a how-to article. However, as mentioned above, this pipeline follows standard practices, so you can see current trends for testing web apps. Hopefully, this article is useful if you're starting on your web development journey or if you're curious how others perform testing for web applications.

Static code analysis

Static code analysis is the process of checking the code of an application without executing the application. It's a quick check to ensure that the code looks okay on the surface. It doesn't catch every bug, but it does catch typos, style errors, and a few other code-related problems.

Running static code analysis is almost instantaneous and prevents a lot of small issues, like a misspelled variable or a missing parenthesis around a block of code. These problems are simple to miss, particularly if a code commit contains lots of changes.

It's especially important to run static code analysis if your programming language is not statically typed. A statically typed language checks that the value assigned to a variable matches its declared data type before the program runs, so the compiler catches mismatched data, which is often a sign of a bug.

If your programming language is dynamically typed, like Ruby or plain JavaScript, you can accidentally introduce bugs by doing something like assigning a string to a variable when you expect an integer. Static code analysis tools, particularly type checkers that read type annotations, can catch many of these mistakes for you.
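To make that concrete, here's a minimal sketch in Python, chosen only for illustration. The `total_price` function and its arguments are hypothetical. Python doesn't enforce type annotations at runtime, so the mistaken call below runs without complaint and quietly returns a string, while a static type checker such as mypy would flag the mismatched argument without executing anything.

```python
def total_price(quantity: int, unit_price_cents: int) -> int:
    """Return the total cost in cents for a number of identical items."""
    return quantity * unit_price_cents


# Correct call: both arguments are integers.
correct_total = total_price(3, 250)

# Buggy call: the quantity arrives as a string, perhaps straight from a form.
# Python does not raise an error here. It repeats the string 250 times instead.
# A static type checker reports the mismatched argument type before this runs.
buggy_total = total_price("3", 250)

print(correct_total)               # 750
print(type(buggy_total).__name__)  # str
```

The bug is silent at runtime, which is exactly why it's worth catching before the code ever executes.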

Static code analysis also helps with consistency. Many of these tools enforce coding styles. If someone doesn't follow a predetermined coding style, the tool will either fix it automatically or raise an alert. You won't have to spend time asking your team to clean up minor code details like using two spaces instead of four for indentation.
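The sketch below is a toy version of this idea, written in Python with only the standard library. It isn't how any real linter is implemented. The `find_issues` function and its two-space indentation rule are assumptions made for the example. It parses the source without running it, then checks each line against one style rule.

```python
import ast


def find_issues(source: str, indent_width: int = 2) -> list[str]:
    """Inspect source code without executing it and return a list of problems."""
    issues = []

    # Syntax check: parsing builds a syntax tree but never runs the code.
    try:
        ast.parse(source)
    except SyntaxError as error:
        issues.append(f"line {error.lineno}: syntax error: {error.msg}")

    # Style check: leading spaces must be a multiple of the agreed width.
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        if stripped and indent % indent_width != 0:
            issues.append(
                f"line {number}: indentation is not a multiple of {indent_width} spaces"
            )

    return issues


# A missing closing parenthesis is reported as a syntax error.
print(find_issues("total = (1 + 2\n"))

# Three spaces of indentation breaks the two-space style rule.
print(find_issues("def greet():\n   return 'hi'\n"))
```

Real tools apply hundreds of rules like these and can often fix the style violations automatically.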

The tools I commonly use for static code analysis are ESLint for JavaScript and RuboCop for Ruby. There are static code analysis tools for just about every programming language. Python has Pylint, PHP has PHPStan, and Go ships with built-in commands like gofmt and go vet. Some tools, like PMD, work across different programming languages.

If you're looking for a static code analysis tool for your workflow, this GitHub repository contains an extensive list of tools for your needs.

Unit tests

After static code analysis, the next step is testing individual functions and components with unit tests.

Unit tests are the smallest test cases to write in any application. You take a small piece of your code, often with a few required parameters, and validate its output in isolation. These tests are short, execute quickly, and usually make up the bulk of an automated test suite.
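As a rough illustration, here's what a unit test can look like in Python. The `slugify` function is a hypothetical piece of application code, and the tests are plain functions with assertions, written in the style that a runner like pytest discovers by their `test_` prefix.

```python
import re


def slugify(title: str) -> str:
    """Turn an article title into a URL-friendly slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def test_slugify_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


def test_slugify_drops_punctuation():
    assert slugify("Testing: Why Bother?") == "testing-why-bother"


def test_slugify_handles_blank_titles():
    assert slugify("   ") == ""
```

Each test feeds one small input to one small function and checks one result, which is what keeps unit tests fast and easy to diagnose when they fail.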

In my experience, most developers these days are consistent enough with writing unit tests for their code. Most web development frameworks contain the tools needed to write unit tests quickly and effectively. However, I have noticed a pattern when teams work on more than one programming language in a project. The team usually focuses on unit testing one part of the application and neglects the other.

For instance, most of the modern web applications I work on use Ruby on Rails on the backend, and React for the frontend. The team would have excellent unit test coverage on the Rails side, but barely any unit tests for the React components. It caused many bugs to slip by and lowered the overall quality of the application. So be aware of the parts of your entire application that would benefit from unit testing.

There are tons of unit testing libraries for any given programming language. My go-to libraries include RSpec for Ruby applications and Jest for JavaScript applications. These libraries are my personal preferences and work great, but there are plenty of other options if you don't like anything about these particular tools. Ruby has other libraries like minitest, while JavaScript has Mocha and Jasmine.

Regardless of your programming language or framework, I encourage you to experiment with different tools to see which one feels best for you.

Security tests

Teams often write and execute automated tests to prevent bugs from slipping by. But it's equally important to verify that the code pushed to production is secure. Insecure code looks okay on the surface and functions as expected, but someone can exploit it for sinister motives like gaining unauthorized access to your systems. That's why adding automated security tests is useful in any testing pipeline.

Automated security testing is similar to static code analysis. Its primary purpose is to scan the codebase and identify vulnerabilities in existing code. These tools can find common exploits like SQL injection, cross-site scripting, and remote code execution.
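To show the kind of pattern these scanners look for, here's a small sketch of SQL injection using Python's built-in `sqlite3` module. The table, the function names, and the data are all made up for the example, and this is not output from any particular security tool. The first lookup builds its query with string interpolation, and the second uses a parameterized query.

```python
import sqlite3


def create_database() -> sqlite3.Connection:
    """Create an in-memory database with two example users."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    connection.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", 1), ("bob", 0)],
    )
    return connection


def find_users_unsafe(connection: sqlite3.Connection, name: str) -> list[str]:
    # Vulnerable: user input is pasted directly into the SQL string.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return [row[0] for row in connection.execute(query)]


def find_users_safe(connection: sqlite3.Connection, name: str) -> list[str]:
    # Safer: the driver passes the input as data, never as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in connection.execute(query, (name,))]


connection = create_database()
malicious_input = "' OR '1'='1"

print(find_users_unsafe(connection, malicious_input))  # every user leaks out
print(find_users_safe(connection, malicious_input))    # []
```

Both functions behave identically for normal input, which is why this kind of flaw survives functional testing and needs a dedicated check.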

Our primary tool for security testing is Brakeman for our Ruby on Rails backend applications. We explored options for our frontend clients in JavaScript but did not find anything suitable for our needs. If you have any suggestions, please let me know in the comments below.

Also, we did not run security checks in other parts of our infrastructure because we often used managed services like Heroku to run our application. They handle all of these updates automatically, so we don't have to worry about it. If your organization manages its hardware to run your applications, make sure the team is performing regular security checks on those systems and keeping them updated.

Another useful service we use to keep our code secure is Dependabot. This service connects to your code repository and automatically scans your application's dependencies. When a dependency is outdated or has a disclosed vulnerability, Dependabot creates a pull request that updates it, so you can patch insecure code quickly. You don't have to keep an eye out for all the libraries in your app.
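The core idea behind a dependency scan can be sketched in a few lines of Python. This is not how Dependabot is implemented, and the package names, version numbers, and "first patched version" list below are entirely hypothetical. The sketch compares what's installed against the first version known to contain a fix.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Convert a dotted version such as '2.10.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def vulnerable_dependencies(
    installed: dict[str, str], first_patched: dict[str, str]
) -> list[str]:
    """Return the names of installed packages older than their patched version."""
    return sorted(
        name
        for name, version in installed.items()
        if name in first_patched
        and parse_version(version) < parse_version(first_patched[name])
    )


# Hypothetical data, for illustration only.
installed = {"example-http": "2.9.0", "example-json": "1.4.2", "example-auth": "3.0.0"}
first_patched = {"example-http": "2.10.1", "example-json": "1.4.0"}

print(vulnerable_dependencies(installed, first_patched))  # ['example-http']
```

Comparing versions as tuples of integers matters here, since a plain string comparison would wrongly rank "2.9.0" above "2.10.1".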

Of course, this type of automated security testing is not comprehensive. It alerts you to known vulnerabilities only, not to any new threats that pop up. Researchers discover new exploits every day, so you still need to do your part. Keep your systems patched, and ensure the libraries your application is using are up to date.

Accessibility tests

Your application might work great, but does it work great for everyone accessing your site? Specifically, can people with limitations such as blindness or cognitive impairments use your app without issues? That's where accessibility testing comes in handy.

Accessibility testing verifies your application to ensure that it's usable by people with disabilities. Depending on your application, there are specific guidelines and rules you can follow to make it easier for everyone to use your app. Some examples are the Web Content Accessibility Guidelines for websites and Android's guidelines for its ecosystem of mobile applications.

This testing is often done manually by going through the application using the same tools that those with limitations use in their daily lives. For instance, you can use a screen reader to make sure people with visual impairments understand how your site is structured. Another example is using your application with your keyboard only since some might not be able to use a mouse or trackpad.

Not everything has to be manual, though. You can automate a large portion of accessibility testing. The most widely-known accessibility testing tool is axe, which you can use with web applications and other platforms. axe is designed to work in any environment.
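To give a feel for what an automated accessibility rule does, here's a toy check in Python using the standard library's HTML parser. It is not axe's API and covers only a single rule, images without alternative text, while real tools run many more. The class and function names are assumptions for the example.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collect every <img> tag that has no alt attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.violations: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            source = attributes.get("src", "(no src)")
            self.violations.append(f"image {source} is missing an alt attribute")


def find_missing_alt(html: str) -> list[str]:
    """Return one message per image that a screen reader cannot describe."""
    checker = MissingAltChecker()
    checker.feed(html)
    return checker.violations


page = '<main><img src="chart.png"><img src="logo.png" alt="Company logo"></main>'
print(find_missing_alt(page))  # ['image chart.png is missing an alt attribute']
```

A check like this can run on every commit, leaving the manual sessions with a screen reader for the problems that automation can't judge.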

The best thing about axe is its flexibility. Many programming languages and frameworks have libraries that allow you to use axe in any kind of environment. Some examples are jest-axe if you're using Jest for JavaScript testing, and axe's matchers if you have a Ruby on Rails application. I previously covered accessibility testing in more detail, including using it during end-to-end testing.

Accessibility testing is often left behind, either because of a lack of knowledge about this kind of testing or because it's perceived to be time-consuming. But that should not be an excuse to skimp on accessibility testing. Some tools handle a large chunk of the work for you.

Integration tests

Up to now, our tests cover our code and the functionality of individual sections of the application. But your code often doesn't work on its own. It has to work in harmony with other parts of the application, which is why integration testing is an essential step in the process.

As opposed to unit tests where you only validate one component, integration tests can validate an entire flow of interconnected components. For example, if you're testing blogging software, an integration test can validate the flow to create an article in a blog. If your application processes online payments, an integration test can verify your payment processing logic is correct.

Integration tests take longer to write and run than unit tests, but they're still fairly quick since they often don't need to interact with external dependencies. This type of testing is useful because even when individual functions work as expected on their own, you might uncover issues when those functions interact with each other.

It's common that the same tools you use for unit testing also allow you to perform integration testing. The previous tools mentioned, like RSpec, Jest, and Jasmine, allow you to write integration tests alongside your unit tests. It's useful to keep the same tools in place since you and your team can build your test suite without changing the environment.

End-to-end testing

While integration testing is useful, we can take it a step further. Instead of writing tests that interact with a handful of components, we can do end-to-end testing to test everything we can.

Similar to integration tests, end-to-end tests cover longer flows of your application. However, one of the main differences is that they run under real-world scenarios. That means these tests interact not only with the application under test, but also with any external dependencies such as databases, messaging queues, and third-party APIs. These tests are as close as you can get to how your application behaves once deployed.
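As a scaled-down illustration, the Python sketch below starts a real HTTP server on a local port and sends it a real request over the network stack, instead of calling the handler function directly. The `/health` endpoint and its response are invented for the example. Browser-driven end-to-end tools go much further than this, but the principle of testing through the same door your users come in is the same.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class HealthHandler(BaseHTTPRequestHandler):
    """A stand-in application with a single JSON endpoint."""

    def do_GET(self):
        body = json.dumps({"status": "ok", "path": self.path}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


def fetch_health_over_http() -> tuple[int, dict]:
    """Boot the server, call it over HTTP, and return the status and payload."""
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0 picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        opener = build_opener(ProxyHandler({}))  # ignore any proxy settings
        with opener.open(f"http://127.0.0.1:{port}/health", timeout=5) as response:
            return response.status, json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()


def test_health_endpoint_end_to_end():
    status, payload = fetch_health_over_http()
    assert status == 200
    assert payload == {"status": "ok", "path": "/health"}
```

Even this tiny test depends on a free port, a thread, and a network round trip, which hints at why end-to-end tests are slower and more fragile than the other kinds.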

End-to-end tests are valuable, but they have a couple of drawbacks to keep in mind. End-to-end tests cover lots of ground and deal with external dependencies, while other forms of testing don't. That means these tests tend to fail for unexpected reasons, and they are slow to run. That's a good reason to keep these tests to a minimum, although you can use them to automate more test coverage. The Practical Test Pyramid is a great article explaining why.

My end-to-end testing framework of choice is TestCafe, as I have covered in many articles on Dev Tester. But there are many excellent choices for end-to-end testing, such as Selenium, Cypress, and Nightwatch.js. These tools help you get your end-to-end tests started and running smoothly.

Manual testing

After all this talk of automated testing, you'd think that it covers all angles of our testing journey. But even if you have 100% automated test coverage for your application - which is far past the point of diminishing returns - it's not enough to only run automated testing. You still need to perform manual testing if you want to make sure new bugs don't creep in.

Automated tests are great for making sure regressions don't exist, like an old bug resurfacing. However, these tests won't catch new or pre-existing undiscovered bugs. It's not uncommon to have new bugs pass by undetected in a code review, or discover a nasty bug hiding in a part of the app that hasn't changed in years. You still need someone to poke around and find what other tests haven't found.

At most places I worked at, the organization had at least one staging server that mimics the production environment. This server is a sandbox where developers and QA can tinker around with the application to check that things are working well outside of their usual working environments during development or testing. The use of a staging server usually happens just before an application goes out to production and the rest of the world.

Some organizations I worked at also spun up temporary servers to test particular functionality. For instance, organizations that use Heroku for their infrastructure and GitHub for their code repository can use their Review Apps service. After setting up the service, Heroku generates a temporary environment for your application when someone creates a GitHub pull request. It allows for manually testing new functionality one piece at a time.

Running manual exploratory tests is still a necessity in today's world, even with all the useful shiny automation tools we have at our disposal. You need to have a real person go through the application and test it in ways that automation hasn't. A computer only does what you tell it to do, so it's excellent at performing repetitive tasks. A person goes exploring beyond the boundaries to ensure that unexpected issues won't pop up at the worst possible moment.

Tying it all together with continuous integration

What good is having a full-blown automated test suite if it's not running all the time? Developers and testers can trigger the tests themselves, but that's not practical. The best tool to put into play is continuous integration.

In simple terms, continuous integration is a practice used by development teams to monitor code changes and kick off processes to ensure the codebase is healthy. In practical terms, it often means a system that detects when someone commits code and triggers some operation, often including running an automated test suite.

In most teams, their continuous integration system hooks into their code repository. When a change happens, like a commit or a new pull request, the continuous integration system sees it and starts running a pre-defined set of steps. These steps can be anything from setting up the application under test to running the tests to building the application for deployment.
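Stripped of all the infrastructure, the "pre-defined set of steps" boils down to something like the Python sketch below. It is only a model of the idea, not how CircleCI or any other service is configured. The step names are placeholders, and each step is a function that reports whether it passed.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_pipeline(steps: list[Step]) -> list[str]:
    """Run each step in order and stop at the first failure."""
    log = []
    for name, run in steps:
        if run():
            log.append(f"{name}: passed")
        else:
            log.append(f"{name}: failed")
            break  # later steps never run on a broken build
    return log


# Placeholder steps. In a real pipeline, each would shell out to a tool.
steps: list[Step] = [
    ("static analysis", lambda: True),
    ("unit tests", lambda: True),
    ("integration tests", lambda: False),
    ("end-to-end tests", lambda: True),
]

print(run_pipeline(steps))
# ['static analysis: passed', 'unit tests: passed', 'integration tests: failed']
```

Ordering the steps from fastest to slowest means a typo fails the build in seconds instead of after a long end-to-end run.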

The continuous integration service I use often is CircleCI. It's one of my favorite CI tools to use because it's easy to set up and has lots of functionality, like running processes in parallel. On one recent project I worked on, the team updated our CI process to run different tests simultaneously, reducing total testing time in our builds by half.

CircleCI isn't the only robust continuous integration tool out there. There are lots of excellent services out there like Travis CI, Semaphore CI, and AWS CodeBuild. Or if your organization prefers to keep things in-house, self-hosted continuous integration services like Jenkins or Drone CI fit the bill.

The main benefit of a continuous integration system is that once set up to your organizational needs, it handles all your automation for you. You don't need to rely on someone running your tests since it happens on every code change.

What can we improve in this pipeline?

The different types of testing and pipeline covered here reflect a real-world scenario. It's a lot to take in, and it does take time to get to the point that we got to. The organization was happy with the results: the pipeline not only helped decrease the number of defects in our applications but also reduced QA time per release cycle by almost half.

Still, there are a few things that, if I had the time and budget, I would improve in the testing pipeline.

Implement a continuous deployment workflow. Automated test coverage increases our confidence in our work. The next step to take can be to adopt continuous deployment. Any time new code gets merged into a primary branch, it automatically ships out to production. This process involves a lot more moving parts in addition to testing. But it's worth it because it enforces quality at all levels, leading to better software.
Build most automated test environments close to production. Most automated testing takes place in a development-type environment to aid with debugging. It's excellent for areas like unit and integration testing that have a smaller scope. However, for end-to-end testing, it's best to run them in a production-like environment for better results. In most projects I worked on, we didn't get to that point because it needs time and planning, which we unfortunately didn't spare for this purpose.
Perform canary testing with feature flags. Despite our best efforts, we'll never know with 100% accuracy how something works in real-world scenarios. Canary testing is the practice of deploying new functionality to production but making it accessible to only a select few. Using feature flags to toggle who gets to use what is an excellent way to ensure new functionality works in the real world.
Minimize scripted testing and increase exploratory testing. I've noticed that most projects have a list of test cases for manual QA to follow. Scripted testing has its place, but teams often rely on it too much. I believe that exploratory testing - poking around without any guidelines - is a more effective way to test an application. I previously covered more on this topic.
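The canary testing idea from the list above can be sketched in a few lines of Python. This is one common approach, a percentage rollout based on hashing, and not the API of any particular feature flag service. The flag name and user IDs are hypothetical.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Decide whether a user falls inside the rollout group for a flag."""
    # Hashing the flag and user together gives each user a stable bucket
    # from 0 to 99, so the same person always gets the same answer.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


# Expose the hypothetical "new-checkout" feature to roughly 10% of users.
users = [f"user-{number}" for number in range(1000)]
canary_group = [user for user in users if is_enabled("new-checkout", user, 10)]

print(len(canary_group))  # roughly 100 of the 1,000 users
```

Because the bucket comes from a hash instead of a random number, a user doesn't flip between the old and new behavior from one request to the next, and raising the percentage only ever adds users to the group.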

Summary

I've worked for a lot of companies and startups throughout my career. I've noticed that most companies that have automated testing pipelines for their web applications typically follow the same set of steps.

The most common testing steps these days are static code analysis, unit testing, and end-to-end testing. Depending on your workflow, you might have more types of automated testing or spend more time on manual testing procedures. Other forms of testing, like security and accessibility, are great safety nets. They don't necessarily verify functionality, but they catch issues that often don't surface at first glance.

Of course, all of the testing in the world doesn't matter if you're not running your test suite consistently. Using a continuous integration system ensures these tests execute automatically and alert the team of any issues before the code goes live.

All of the testing steps mentioned above might sound like a lot, and it is. It takes teams a long time to get to this point. However, Rome wasn't built in a day, and your testing pipeline won't be either. The payoff of better software and faster delivery is worth the time and effort. Having healthy testing practices running automatically in your organization improves your delivered product and gives you peace of mind.

What other practices does your organization cover in your testing pipeline? What other improvements do you think most teams should practice? Leave a comment below!


Dennis Martinez
