On end-to-end testing

At Doist, our team uses Cypress for end-to-end (e2e) testing of the core workflows of our application. That sentence already points to one of the most important aspects of a good end-to-end test suite: core workflows. Since e2e tests are slow to run (which also makes them expensive to run) and can sometimes be flaky, you shouldn't rely on them to fully test every edge case of your app.

Core Workflows

In a cross-functional initiative, we teamed up with the other client teams (Android, iOS), the product team, and the support team to identify the most important workflows in our products: workflows that should never break if we want to guarantee a baseline user experience. They range from login and signup to paid upgrades and basic app functionality like creating tasks or posting threads. Defining these workflows in a cross-functional manner helped us take the whole picture into account and zoom out from frontend-only concerns.
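
To make this concrete, here's a minimal sketch of what a core-workflow test can look like in Cypress. The route, selectors, and credentials below are made up for illustration and aren't taken from our actual suite:

```ts
// login.cy.ts: a hypothetical core-workflow test for login.
describe('login', () => {
  it('lets an existing user log in and reach their task list', () => {
    cy.visit('/login')

    // In a real suite, credentials would come from a fixture or environment variable.
    cy.get('[data-testid="email"]').type('user@example.com')
    cy.get('[data-testid="password"]').type('a-long-test-password')
    cy.get('[data-testid="submit"]').click()

    // The workflow only counts as working once the user lands somewhere useful.
    cy.url().should('include', '/app')
    cy.contains('Add task').should('be.visible')
  })
})
```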

Once we had identified a set of core workflows, we needed to make sure they never break. To that end, we run the test suite on every PR. No change is allowed to land if it breaks a test, and the team can be reasonably confident that their changes are safe to apply.

Test Hygiene

It's in the nature of end-to-end tests to be flaky (less so with Cypress, but it can still happen). Most of the time they pass reliably, but occasionally you get a spurious failure. It's important to be vigilant about every flaky test. A flaky test failure should be treated much like breaking production: drop what you're doing, look at it, and fix it. At least, that would be the ideal way. The day-to-day of engineers looks different: in most cases you'll re-run the test suite, everything passes, and you forget it ever happened (you have more important things to deal with anyway). This is a slippery slope.
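
Cypress can also contain some of this noise at the framework level with built-in test retries. As a sketch, a cypress.config.ts along these lines retries a failed test a couple of times in CI before reporting it as failed (the numbers are arbitrary, and retries only paper over flakiness rather than fix it):

```ts
// cypress.config.ts: illustrative retry settings, not a recommendation.
import { defineConfig } from 'cypress'

export default defineConfig({
  retries: {
    runMode: 2,  // in CI (`cypress run`), retry a failing test up to twice
    openMode: 0, // during interactive development, fail immediately
  },
})
```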

One of the worst things you can do is start merging changes even though the test suite failed. In the moment, you might brush it off as "it's just that flaky test, totally unrelated to my changes", but what you're really doing is opening the door to merging PRs with failed status checks and undermining trust in your test suite. If you're low on time and notice flaky tests hampering your team's shipping frequency, comment the test out. That sounds radical; after all, you're losing a test and, with it, some confidence in the correctness of your changes. On the other hand, how much confidence did a flaky test really provide? Log the broken test somewhere and try to fix it as soon as time allows. This way you'll never get into the habit of merging PRs with failed status checks.
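
In practice, "commenting out" can be as lightweight as switching the test to Mocha's skip variant, which Cypress supports, with a note pointing to wherever the flake is logged (the test name and comment below are illustrative):

```ts
// Skipped because it's flaky; log it somewhere visible so it doesn't get forgotten.
// Re-enable once the underlying cause is understood and fixed.
it.skip('lets a user upgrade to a paid plan', () => {
  // ...the original test body stays in place, so re-enabling is a one-word change
})
```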