> Some tests are actually very hard to write. I once led a project where the code had both cloud and on-prem API calls
I believe that this is a fundamental problem of testing in all distributed systems: you are trying to test and validate for emergent behaviour. The other term we have for such systems is: chaotic. Good luck with that.
In fact, I have begun to suspect that the way we even think about software testing is backwards. Instead of test scenarios we should be thinking in failure scenarios, and try to subject our software to as many of those as possible. Define the bounding box of the failure universe, and allow the computer to generate the testing scenarios within it. EXPECT that all software within will eventually fail, but as long as it survives beyond set thresholds, it gets a green light.
In a way... we'd need something like a bastard hybrid of fuzzing, chaos testing, soak testing, SRE principles and probabilistic outcomes.
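A minimal sketch of that idea, with all names and numbers mine: enumerate a "bounding box" of failure modes, let the computer generate random combinations within it, and gate on a survival threshold rather than demanding zero failures.

```python
import random

# Hypothetical failure universe; a real one would be domain-specific.
FAILURE_MODES = ["network_partition", "slow_disk", "clock_skew", "node_crash"]

def run_scenario(system, failure_modes, rng):
    """Inject a random combination of failures and report survival."""
    injected = rng.sample(failure_modes, k=rng.randint(1, len(failure_modes)))
    return system(injected)

def probabilistic_gate(system, trials=1000, threshold=0.25, seed=42):
    """Green light if the system survives at least `threshold` of trials."""
    rng = random.Random(seed)
    survived = sum(run_scenario(system, FAILURE_MODES, rng)
                   for _ in range(trials))
    return survived / trials >= threshold

# Toy system: tolerates one or two simultaneous failures, dies on more.
def toy_system(injected):
    return len(injected) < 3
```

With this toy system, `probabilistic_gate(toy_system)` passes at a 25% survival threshold but fails at 99%, which is exactly the "expect failure, gate on a threshold" behaviour, as opposed to a conventional all-green test run.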
> I believe that this is a fundamental problem of testing in all distributed systems: you are trying to test and validate for emergent behaviour. The other term we have for such systems is: chaotic. Good luck with that.
Emergent behaviour is complex, not chaotic. Chaos comes from sensitive dependence on initial conditions. Complexity is associated with non-ergodic statistics (i.e. sampling across time gives different results to sampling across space).
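To make the non-ergodicity point concrete (my illustration, not the commenter's): a process whose randomness is frozen at t=0 gives a time average along one run that disagrees with an ensemble average across many runs.

```python
import random

def frozen_run(rng, length=1000):
    """Each run fixes a random bias at t=0 and keeps it forever."""
    bias = rng.choice([0.0, 1.0])
    return [bias] * length

rng = random.Random(0)

# Sampling across time: average one run over its whole history.
one_run = frozen_run(rng)
time_average = sum(one_run) / len(one_run)        # exactly 0.0 or 1.0

# Sampling across space: average many runs at a single instant.
ensemble = [frozen_run(rng)[0] for _ in range(1000)]
ensemble_average = sum(ensemble) / len(ensemble)  # close to 0.5
```

The two averages differ by roughly 0.5, which is the "sampling across time gives different results to sampling across space" property; for an ergodic process they would converge to the same value.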
I work in the Erlang virtual machine (Elixir) and I regularly write tests against common distributed-systems failures. You don't need property tests (or Jepsen/Maelstrom-style fuzzing) for your 95% scenarios. Distributed systems are not magically failure prone.
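For instance, a common failure like "downstream call times out twice, then succeeds" needs nothing beyond an ordinary deterministic unit test (sketched in Python rather than Elixir; all names are mine):

```python
def call_with_retry(remote, attempts=3):
    """Retry a flaky remote call a bounded number of times."""
    last_error = None
    for _ in range(attempts):
        try:
            return remote()
        except TimeoutError as exc:
            last_error = exc
    raise last_error

def flaky(failures):
    """Fake endpoint that times out `failures` times, then returns "ok"."""
    state = {"left": failures}
    def remote():
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("simulated network timeout")
        return "ok"
    return remote

# Plain asserts, fully deterministic, no property-testing machinery.
assert call_with_retry(flaky(2)) == "ok"   # survives two timeouts
```

The failure injection is just a stub with a counter; the test is as repeatable as any other unit test.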