• Lack of robustness
• Random test result
• Fluctuation in execution time
Lack of robustness often comes from incorrectly defined touch points. If a test needs to include a lot of functionality in order to verify a small detail, then it's bound to lack robustness. Even worse is when a test directly touches the implementation.
Adding extra interfaces to create additional touch points.
Though this doesn't only happen to unit tests; it can happen just as easily, if not more easily, to functional acceptance tests. If they touch the database for validation, instrumentation or data setup, for instance, you will break your tests every time you refactor your database.
Testers seem to be very keen on verifying against the database. This is understandable, since in an old monolith system with manual verification you only have two touch points: the GUI and the database. It's important to work with test architecture and to educate both developers and testers to prevent validation against the implementation.
Another culprit is dependencies between implementation and test code, for example FitNesse fixtures that directly access model objects, DAOs or business logic implementation.
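A rough sketch of what the decoupled alternative can look like; the facade and fixture names below are invented for illustration, not our actual code. The fixture only talks to a narrow, stable interface, never to entities or DAOs:

```java
// Hypothetical FitNesse-style fixture: the test code depends only on a narrow,
// stable facade (the touch point), never on DAOs, entities or other internals.
interface OrderFacade {
    String placeOrder(String customerId, String productCode, int quantity);
    String orderStatus(String orderId);
}

public class PlaceOrderFixture {
    private final OrderFacade orders;
    private String lastOrderId;

    public PlaceOrderFixture(OrderFacade orders) {
        this.orders = orders;
    }

    // Called from the FitNesse table; delegates straight to the facade.
    public void placeOrder(String customerId, String productCode, int quantity) {
        lastOrderId = orders.placeOrder(customerId, productCode, quantity);
    }

    // Verification also goes through the facade, not a database query,
    // so refactoring the schema cannot break this test.
    public String status() {
        return orders.orderStatus(lastOrderId);
    }
}
```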
Tests must be robust enough to survive refactoring and the addition of functionality. Changed functionality is obviously another matter.
We actually managed this quite well with our touch points.
Random test failures are a horrible thing, because they result in either unnecessary stoppage of the line or people losing respect for the color red.
The Jenkins job history for our functional tests could look like this.
Since it's desirable to keep execution times short, we reconfigure the system at test time to use much shorter timeouts and delays. This further increases the amount of concurrent work that happens for each request.
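The exact mechanism doesn't matter much; as a sketch (the flag name and the values below are made up), it can be as simple as one switch that shrinks every delay when the suites run:

```java
// Illustrative only: one switch that shrinks all delays at test time.
public final class Delays {
    // Assumed flag; set with -Dapp.testMode=true when running the test suites.
    private static final boolean TEST_MODE = Boolean.getBoolean("app.testMode");

    public static long retryDelayMillis() {
        // Production waits 30 s between retries, tests only 100 ms, which packs
        // far more concurrent activity into the same wall-clock second.
        return TEST_MODE ? 100L : 30_000L;
    }

    private Delays() { }
}
```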
When all these simultaneous threads hit the same record in one table, you get transactional issues. We had solved this through optimistic locking, so we had a lot of rollbacks and retries. But it "worked". Our execution times, however, were very unpredictable, and since our tests were sensitive to timing they failed randomly.
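For readers unfamiliar with the pattern, this is roughly what it looks like with JPA; the entity and the retry loop below are a generic sketch, not our actual code:

```java
import javax.persistence.*;

// Generic sketch of optimistic locking plus retry; names are illustrative.
@Entity
class Counter {
    @Id
    Long id;

    @Version      // bumped on every update; a concurrent writer that read the
    long version; // same version loses the race and fails on commit

    long value;

    void increment() { value++; }
}

class CounterService {
    private final EntityManagerFactory emf;

    CounterService(EntityManagerFactory emf) { this.emf = emf; }

    void incrementWithRetry(long counterId, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            EntityManager em = emf.createEntityManager();
            try {
                em.getTransaction().begin();
                em.find(Counter.class, counterId).increment();
                em.getTransaction().commit();      // throws on a version conflict
                return;
            } catch (RollbackException | OptimisticLockException e) {
                if (em.getTransaction().isActive()) {
                    em.getTransaction().rollback();
                }
                if (attempt >= maxAttempts) {
                    throw e;                       // give up and surface the conflict
                }
            } finally {
                em.close();
            }
        }
    }
}
```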
Really, though, did we congest our test scenarios so much that this became a problem? How on earth could we hit the optimistic lock so often that 7 out of 10 regression runs failed because of it?
Wasn't it actually the case that the tests were trying to tell us something? Eventually we found that our GET requests were actually creating transactions, due to an incorrectly instantiated empty list marking an object as new. We also changed our pool sizing, since we were exhausting it and causing a lot of waiting.
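To give a flavour of that kind of bug, here is a reconstruction in spirit only (made-up names, assuming Hibernate-style dirty checking, not the actual defect):

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.*;

// Reconstruction of the kind of bug meant above: a read path that replaces a
// persistent collection, so the ORM considers the entity modified and an
// innocent GET request ends up opening a write transaction.
@Entity
class CustomerOrder {
    @Id
    Long id;

    @ElementCollection
    private List<String> tags = new ArrayList<>();

    public List<String> getTags() {
        // BUG: this "defensive copy" discards the collection wrapper the ORM is
        // tracking; on flush the collection is seen as dirty, and the supposedly
        // read-only request ends up updating (and locking) rows.
        tags = new ArrayList<>(tags);
        return tags;
    }
}
```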
So we had a lot of bugs that we blamed on the nature of our application.
Listen to your tests!! They speak wise things to you!
Eventually we refactored the table everyone was hitting so that we no longer do any updates on it, but instead track changes in a detail table. Now we don't need any locking at all. Sweet free feeling. Almost like dropping in alone on a powder day. ;)
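The pattern, sketched with made-up table and class names (plain JDBC just to keep the example self-contained), is simply append-and-aggregate instead of update-in-place:

```java
import java.sql.*;

// Illustrative sketch of the "no updates, only a detail table" refactoring:
// every change is appended as a new row and the current state is derived by
// aggregation, so concurrent writers never contend on a single row.
class StockDao {
    private final Connection connection;

    StockDao(Connection connection) { this.connection = connection; }

    // Before: UPDATE stock SET quantity = quantity + ? WHERE product_id = ?
    // After: an append-only insert into the detail table.
    void recordChange(long productId, int delta) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(
                "INSERT INTO stock_change (product_id, delta) VALUES (?, ?)")) {
            ps.setLong(1, productId);
            ps.setInt(2, delta);
            ps.executeUpdate();
        }
    }

    int currentQuantity(long productId) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(
                "SELECT COALESCE(SUM(delta), 0) FROM stock_change WHERE product_id = ?")) {
            ps.setLong(1, productId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }
}
```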
Fluctuation in execution time. I find it just as important that a test executes in the same time every run as that it always gives the same result.
First of all, a failed test should be faster than a success: give me feedback ASAP! Second, if a test takes 15 seconds to run it should always take 15 seconds, not sometimes 5 and sometimes 25. This matters when building your suites. You want your very fast smoke test suites, your bang-per-minute short suites and your longer-running suites, so being able to easily group tests by execution time is important.
It's also important to be able to do trend analysis on suite execution time. It's a simplistic but effective way to monitor performance degradation.
We have actually nailed quite a few performance bugs this way.
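As a toy illustration of that kind of check (the threshold and the window are arbitrary assumptions, and a CI server can often chart this for you):

```java
import java.util.List;

// Toy sketch: flag a run whose duration is noticeably above the recent average.
final class SuiteTrend {
    static boolean looksSlower(List<Long> recentDurationsMillis, long latestMillis) {
        double average = recentDurationsMillis.stream()
                .mapToLong(Long::longValue)
                .average()
                .orElse(latestMillis);
        // 20% above the rolling average is an arbitrary threshold for this sketch.
        return latestMillis > average * 1.2;
    }

    private SuiteTrend() { }
}
```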
The asynchronous nature makes everything harder to test, but don't make it an excuse for yourself. Random behavior is tricky and frustrating in a pipeline, but remember: TRUST YOUR TESTS!