Continuous Delivery: Continuous Delivery

Tuesday, January 15, 2013

Package power!

We often talk about pipe design and how to implement it in Jenkins or other CI tools, that everything should be versioned and that everything should be tested all the time. These things are very important, but something I didn't realize for quite some time was how important packaging is.

Our packaging was giving us problems.

Early on when building our continuous delivery pipe we were a bit worried about the number of artifacts we were spewing out of our pipe and the impact it would have on our Nexus repo. So we did release our war and jar files into our repo, but the final deliverable assembly we released was just a property file containing versions. These property files were used by our rudimentary bash deploy scripts. The scripts basically did a bunch of wgets to retrieve the artifacts from the Nexus repo before deploying them. Yeah, laugh all you want, I now know how dumb this was.
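To make the problem concrete, here is a minimal sketch of roughly what such a script boils down to. It is written in Python rather than bash, and the property file format, the component names, the Nexus host and the group id are all assumptions made up for illustration. Only the Maven repository path layout (group/artifact/version/artifact-version.packaging) is standard.

```python
# Hypothetical sketch: the property file format, host name and group id are
# illustrative assumptions, not the real setup described in this post.
NEXUS_BASE = "http://nexus.example.test/content/repositories/releases"  # assumed URL

DELIVERY_Y = """\
# versions for delivery Y (made-up component names)
component-a.war=1.4.2
component-b.war=2.0.1
component-c.jar=0.9.7
"""


def parse_versions(text):
    """Parse 'artifact.packaging=version' lines, skipping blanks and comments."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, version = line.partition("=")
        versions[key.strip()] = version.strip()
    return versions


def artifact_url(key, version, group_path="com/example/delivery"):
    """Build a download URL following the standard Maven repository layout."""
    artifact_id, _, packaging = key.rpartition(".")
    return (
        f"{NEXUS_BASE}/{group_path}/{artifact_id}/{version}/"
        f"{artifact_id}-{version}.{packaging}"
    )


if __name__ == "__main__":
    for key, version in parse_versions(DELIVERY_Y).items():
        # One download per component, then a delivery specific deploy step.
        print("wget", artifact_url(key, version))
```

The weakness shows even in a sketch this small: the list of components and what to do with each one lives in the script and the property file, not in a released package.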

Our main problem due to this was that our scripts were very delivery specific. For delivery Y we had components A, B and C while for delivery Z we had components A, D and E. We couldn't reuse things well enough so we had duplicates of our scripts. Another issue we had was that there was no portability in this whatsoever. We didn't really make the connection between lack of packaging and our huge developer environment problems. Switching between working on delivery Y and Z was tedious because we were managing the local deployments in Eclipse with the JBoss plugin. It also required full understanding of what components needed to be deployed.

Manual tasks and a required high level of domain knowledge didn't make things easy for our new developers. In fact it also made life a pita for our architects, who develop fewer hours a week than the developers. For them the rotting of the development environment was a huge issue. Since all components were managed manually, all had to be updated, built and deployed.

Inspiration and goals.

When my colleague and I were at QCon NY (awesome conf that everyone should try to attend) we listened to talks by Netflix and Etsy. We were totally blown away by two things: Etsy's practice that a new developer should code and deploy a production change on the first day, and Netflix's baking of images instead of releasing wars and ears. These were two of the main things we brought back with us and two things that we keep revisiting as we iterate our process.

Since we don't do continuous deploy we set the goal that a new developer should be able to commit a change that is ready for delivery on the first day. The continuous delivery part of the goal wasn't the problem since we already had that in place. It's the most obvious part of that goal. The next obvious task for us was that we really had to do something about our dev env setup. Then with some thought we realized that this wasn't enough; we needed to do something about our entire onboarding process, with mentoring and level of knowledge in the team. In order to mentor someone, a developer needs to have a good understanding of most tasks in JIRA. At this stage this wasn't the case.

We made the knowledge increase our priority since this was biting us in many ways. I won't go much more into that. Then we tried to prioritize the setup of our developer environment, but doing something about our deploy scripts ended up being a higher priority. This was a very good and honestly lucky decision. We knew how to do our deploy script changes and our production deployments were really more important. But we were also not sure how to do our developer environment changes, so sleeping on it was what we decided, even though our devs were literally screaming in frustration.

Addressing the problems.

The first thing we did when we started to rewrite our scripts was to sort out our packaging once and for all. We killed the property file and started using Maven for everything. We had already been using Maven to release all components and most configurations. But we were not using Maven to package our final deployables and we were not using it to release our deploy scripts. We had already been made very well aware that we had to tie our deploy scripts to our deployable assembly. We changed both these things. We started to release everything and not just version everything. This imho is a very important thing that's not mentioned enough. Blogs, articles and demos talk about versioning everything but not so much about the importance of actually releasing everything and treating each release as an artifact, even if it's "just" an httpd conf.

Once we started building these packages and setting our structure it was so clear how Netflix came to the conclusion that they should bake images. The package contains war files, config files, deploy scripts, liquibase scripts, custom JBoss control scripts, httpd conf, etc. The more we package and the more servers we get in our park, the more things we notice that we need to put into the package. Then it becomes even more obvious, since we take this package and transfer it to tons of servers for different test purposes. Once at the server we run our deploy scripts that copy and link stuff into place on the server. Remind me, why are we doing this over and over? Wouldn't it be better to just do this once, make an image out of it and mount this image on different nodes? Of course it would be, Netflix knows what they are talking about! Most importantly it would bring the final missing pieces into the package: JBoss, Java and Linux distributions. That gives us the power to actually roll out and test even OS patches through the same process as any other change. We aren't there yet, but the path is obvious and it's nice to feel that what was once an overwhelming w000t is now a definite possibility.

So through a good packaging strategy we managed to improve and solve our deploy script problems. We now had one script to distribute and deploy them all! This also resulted in much fewer changes to the deploy scripts, which in turn made them more stable. A lot of changes that previously required changes to deployment scripts now just require a change to the packaging, which makes the entire deployment process much more robust.

Portability!

Still though we hadn't solved our issues with our developer environments. I had the hunch for some time that our packaging could help us. Still it took us some time before we realized that we actually had created an almost fully portable deployment solution. Our increased Maven usage had made us so portable that we could actually just write a simple script that combined the essence of the assembly job and the deployment job of our Jenkins pipe into a local dev env script. By adding "snapshots true" to our Maven version properties update we allowed our assemblies to be built including snapshots. Then we could just use our deploy scripts and voila, our local JBosses and Mule ESBs were deployed with artifacts containing our code changes and most importantly our rebel.xmls, giving us full JRebel power with our production deploy scripts.

Our packaging strategy had made our continuous delivery process portable to our development environment, allowing us to use the same assemble+deploy from local dev env to prod. Our developers now just need to know what assembly to deploy and they don't need to rebuild all included components, just the ones they are currently working with. The others are added by Maven from the Nexus repo. So now our developers can quickly and easily switch between single component deploys and full deliveries.

Getting closer to our goals.

By adding JBoss & Mule installations to the script we further simplified the setup process for the new developers. We still have a few things we want to add to the script, such as IDE install and initial source code checkout, in order to simplify things further, but that will have to rest a bit since we have other higher priorities. Still we have taken huge steps towards our Etsy inspired goal of having new developers commit a code change on the first day.

It feels like all these levels of improvement have been unlocked by a good packaging strategy!

If there's one thing I would change about the way we have gone about our implementation, it's the packaging. It's easy to say in hindsight but I'd really try to do it properly off the bat.

Wednesday, January 9, 2013

Test for runtime

Traditionally our testers have been responsible for functional testing, load testing and in some cases some failover testing. This covers our functional requirements and some of our supplemental requirements as well. Though it doesn't cover the full set of supplemental requirements and we haven't really taken many stabs at automating these in the past.

The fact that we haven't really tested all the supplemental requirements also leaves a big question: whose responsibility is verification of supplemental requirements? Let's park that question for a little bit. To be truthful we don't really design for runtime either. Our supplemental requirements almost always come as an afterthought and after the system is in production. They always tend to get lost in the race to get features ready.

In our current project we try to improve on this but we are still not doing it well enough. We added some of the logging related requirements early but we have no requirement spec and no verification of the requirements.

The logging we added was checkpoint logging and performance logging. Both these are requirements from our operations department. The checkpoint logging is a functional log which just contains key events in an integration. It's used by our first line support to do initial investigation. The performance log is for monitoring performance of defined parts of the system. It's used by operations for monitoring the application.

Let's use user registration as an example (it's a fictitious example).

1. User enters name, username, password and email into a web form.
2. System verifies the form.
3. System calls a legacy system to see if the email is registered in that system as well.
3a. If the user is registered in the legacy system with a matching username and password, the userid is returned from that system.
4. System persists user.
5. Email is sent to user.
6. Confirmation view displayed.

From this we can derive some good checkpoints.

2013-01-07 21:30:07:974 | null | Verified user form name=Ted, username=JohnDoe, email=joe@some.tst
2013-01-07 21:30:08:234 | usr123 | User found in legacy system
2013-01-07 21:30:08:567 | usr123 | User persisted
2013-01-07 21:30:08:961 | usr123 | User notified at joe@some.tst

The performance log could look something like this.

2013-01-07 21:30:07:974 | usr123 | Legacy lookup completed | 250 | ms
2013-01-07 21:30:08:566 | usr123 | User persisted | 92 | ms
2013-01-07 21:30:08:961 | usr123 | User registration completed | 976 | ms

This is all nice but who decides what checkpoints should be logged? Who verifies it?

Personally I would like to make the verification the responsibility of the testers. Though I've never been in a project where testers have owned the verification of any kind of logging. This logging is in fact not "just" logging but system output, hence should definitely be verified by the testers. By making this the responsibility of the tester it also trains the tester in how the system is monitored in production.

So how can this be tested?

Let's make a pseudo FitNesse table to describe the test case.

| our functional fixture |
| go to | user registration form |
| enter | name | Ted | username | JohnDoe | email | joe@some.tst |
| verify | status | Registration completed |
| verify | email account | registration mail received |

This is how most functional tests would end. But let's expand the responsibility of the tester to also include the supplemental requirements.

| checkpoint fixture |
| verify | Verified user form name=Ted, username=JohnDoe, email=joe@some.tst |
| verify | User found in legacy system |
| verify | User persisted |
| verify | User notified at joe@some.tst |

So now we are verifying that our first line support can see a registration main flow in their tool that imports the checkpoint log. We have also taken responsibility for officially defining how a main flow is logged and we are regression testing it as part of our continuous delivery process.
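What a checkpoint fixture does behind the table can be sketched in a few lines. This is a minimal sketch in Python, not a real FitNesse fixture: the function names are made up, and it assumes the log is available as text in the "timestamp | user | message" format shown above.

```python
# Sample log copied from the checkpoint example above.
CHECKPOINT_LOG = """\
2013-01-07 21:30:07:974 | null | Verified user form name=Ted, username=JohnDoe, email=joe@some.tst
2013-01-07 21:30:08:234 | usr123 | User found in legacy system
2013-01-07 21:30:08:567 | usr123 | User persisted
2013-01-07 21:30:08:961 | usr123 | User notified at joe@some.tst
"""

EXPECTED = [
    "Verified user form name=Ted, username=JohnDoe, email=joe@some.tst",
    "User found in legacy system",
    "User persisted",
    "User notified at joe@some.tst",
]


def parse_checkpoints(log_text):
    """Return the message column of every 'timestamp | user | message' line."""
    messages = []
    for line in log_text.splitlines():
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) == 3:
            messages.append(parts[2])
    return messages


def missing_checkpoints(log_text, expected):
    """Return the expected checkpoints that do not appear, in order, in the log."""
    logged = parse_checkpoints(log_text)
    missing = []
    position = 0
    for message in expected:
        try:
            # Search only after the previous match so the order is verified too.
            position = logged.index(message, position) + 1
        except ValueError:
            missing.append(message)
    return missing


if __name__ == "__main__":
    print(missing_checkpoints(CHECKPOINT_LOG, EXPECTED))
```

Checking the order as well as the presence is a design choice of this sketch. First line support reads the log as a flow, so a checkpoint that shows up out of sequence is treated the same as one that is missing.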

That leaves us with the performance log. How should we verify that? How long should it take to register a user? Well we should have an SLA on each use case. The SLA should define the performance under load and we should definitely not do load testing as part of our functional tests. But we could ensure that the function can be executed within the SLA. More importantly we ensure that we CAN monitor the SLA in production.

| performance fixture |
| verify | Legacy lookup completed | sub | 550 | ms |
| verify | User persisted | sub | 100 | ms |
| verify | User registration completed | sub | 1000 | ms |

Now we take responsibility for the system being monitorable in production. We also take responsibility for officially defining what measuring points we support, and since we do continuous regression testing we make sure we don't break the monitorability.
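The performance fixture can be sketched the same way. Again this is a minimal sketch in Python with made-up function names, assuming the "timestamp | user | measuring point | duration | ms" format from the example above and reading "sub" as strictly less than.

```python
# Sample log copied from the performance log example above.
PERFORMANCE_LOG = """\
2013-01-07 21:30:07:974 | usr123 | Legacy lookup completed | 250 | ms
2013-01-07 21:30:08:566 | usr123 | User persisted | 92 | ms
2013-01-07 21:30:08:961 | usr123 | User registration completed | 976 | ms
"""

# Limits taken from the performance fixture table.
LIMITS_MS = {
    "Legacy lookup completed": 550,
    "User persisted": 100,
    "User registration completed": 1000,
}


def parse_durations(log_text):
    """Map each measuring point to its logged duration in milliseconds."""
    durations = {}
    for line in log_text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 5 and parts[4] == "ms":
            durations[parts[2]] = int(parts[3])
    return durations


def sla_violations(log_text, limits_ms):
    """Return {measuring point: problem} for points that are missing or too slow."""
    durations = parse_durations(log_text)
    problems = {}
    for point, limit in limits_ms.items():
        if point not in durations:
            # A measuring point that is not logged cannot be monitored at all.
            problems[point] = "not logged"
        elif durations[point] >= limit:
            problems[point] = f"{durations[point]} ms, limit {limit} ms"
    return problems


if __name__ == "__main__":
    print(sla_violations(PERFORMANCE_LOG, LIMITS_MS))
```

The "not logged" case matters as much as the "too slow" case, since it is the one that catches a refactoring that silently removes a measuring point.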

If all our functional test cases look like this then we Test for runtime.

| our functional fixture |
| go to | user registration form |
| enter | name | Ted | username | JohnDoe | email | joe@some.tst |
| verify | status | Registration completed |
| verify | email account | registration mail received |

| checkpoint fixture |
| verify | Verified user form name=Ted, username=JohnDoe, email=joe@some.tst |
| verify | User found in legacy system |
| verify | User persisted |
| verify | User notified at joe@some.tst |

| performance fixture |
| verify | Legacy lookup completed | sub | 550 | ms |
| verify | User persisted | sub | 100 | ms |
| verify | User registration completed | sub | 1000 | ms |

Saturday, January 5, 2013

Continuous Delivery and DevOps in a legacy organization

I've been using the term legacy organization. My definition of a legacy organization is a slow changing organization that separates professions in silos. The slow changing nature can but doesn't have to be due to size. The separation of professions into silos materializes into a process where responsibility is handed over from profession to profession.

I have intentionally put development and test into the same box. In some legacy organizations you see these separated into two silos where development hands over to a QA department which tests the application. I don't want to say it's impossible to do continuous delivery with that type of setup because nothing is impossible. It requires the development organization to start taking responsibility for testing. It can be done by smart recruiting of developers with test focus but it's going to be hard.

I refer to the above setup as a legacy noDevOps organization because it separates development and operations and suffers heavily from the wall of confusion syndrome, but it is an organization where test driven development is possible. Two of the biggest issues in a legacy noDevOps organization are the gunpoint standoff and dropped responsibility at the wall. The standoff results in unconstructive blame games and lack of constructive change.

The dropped responsibility comes when development just wants out of responsibility at the point of handoff. Project managers want to close the project. Developers want to do new cool stuff. So development picks a few members who get to run at the wall while the rest hide. At the wall the mudball of a deliverable is tossed over, hoping that someone on the other side catches it.

A lot of talks and writeups on continuous delivery more or less assume a DevOps organization. It's definitely much easier that way, since continuous delivery requires the use of the same deployment mechanisms in all environments, which in turn puts a high requirement on similarity in infrastructure. Building a good process without the direct involvement of the infrastructure experts in the operations organization is extremely hard. Doing continuous delivery well requires a higher level of continuous responsibility by the developers. DevOps allows developers to take responsibility in production, which is hard in legacy noDevOps organizations. So yes, obviously continuous delivery is made so much easier with DevOps.

But what should we do? Should we just sit there and wait till a manager calls a meeting and says we are gonna start doing DevOps and CD? If that happens then the DevOps is gonna be so full of friction because our professions are still at gunpoint standoff. So before anything gets done everyone needs to lower their guns and start trusting each other, and this will take time.

It's my firm belief that the standoff is always the "fault" of the development organization. If we had been delivering high enough quality in a stable enough application then there would not have been any standoff and there would have been trust. We can argue all we want that it's not possible to deliver enough stability and quality from a development organization without help and change from operations but it's beside my point. We can only change our own behavior and we can only do that by being the change we want to see.

If we want to deploy more often in order to achieve higher quality then we make sure to hold up our end of the bargain: higher quality and more stable deliveries. We start by taking active responsibility for quality and stability through continuous regression testing. We test our deployments one million times if that's what it takes to make a stable deployment. We improve with each delivery. We take pride in learning from our mistakes and automating tests to ensure they don't happen again. Then over time the trust will increase and the teams will start cooperating more and more.

The development organization is in charge of the full delivery process up to the wall of confusion. So make it the best delivery possible and take pride in delivering high quality out of the development silo. For each successful delivery you bring the wall down one brick at a time.

Also remember that we are talking about continuous delivery, not deployment. It's super important not to ever speak about continuous deployment because it scares the living crap out of the ops team when in a standoff situation. Though always having a deliverable ready and tested at the wall is always going to be appreciated. Then transition into production can happen with less confusion.

I have to confess I'm one of the developers who hates to support an in production application. I fear being on call and once an application is in production I want to change assignment. The reason is how legacy noDevOps organizations go about developer support. Developers have zero trust so we can't access logs, databases or anything in production. So each time a developer needs to help out with a production issue it's with tied hands and a blindfold. It ends up becoming a hostage situation where the developer is held hostage. I love to trace down bugs, solve issues and improve stuff but to be able to do that I need my eyes, my brain and my hands.

We can take charge of this situation as well and stop being victims. We can drastically improve our situation by building monitoring and metrics into our application and verifying them as part of our continuous regression testing. This way we build tools that are gonna be available in production and that operations are gonna require anyway. Usually these are added late and as low priority supplemental requirements from operations. By being proactive we can build this into our architecture and test process and use it throughout our entire delivery process. This way we build more useful monitoring tools that we understand much better. In return we aren't as blind and handicapped when helping out with production issues. Once again we make active steps towards cooperation and trust between organizations while making our own life better.

DevOps makes continuous delivery easier. But continuous delivery can be how we drop the guns and tear down the wall of confusion in a legacy organization and move towards DevOps. Ultimately they should both exist in an organization and I think they will both become as common as agile even in large old organizations.

However, until we are there I think that continuous delivery is a fantastic tool to enable change in an organization suffering from a deadlock. It requires courage, vision, ambition and patience but all the tools are there for us to start making that change today!

Tuesday, January 1, 2013

Oops, our Continuous Delivery process became mission critical

At some point something changed with our Continuous Delivery process: it became mission critical. When we started working on the process it was basically a side project that another Tomas and I had. We added a consultant early in our project and he ended up doing some of the work on the first version of our deployment scripts, but it wasn't anything organized and not part of any process or tools team.

When we increased the number of developers and started seeing issues with stability and scalability we also started to realize that our process had become mission critical. In fact our continuous delivery process had become more important to us than our mail system.

Now we had a mission critical hobby project with the following setup.

• No official owner.
• No official developers.
• No official operations professionals involved.
    • Operations only supporting the OS of the Jenkins and Nexus instances.
• One "live" instance of Jenkins on a super small virtual node.
    • All development done on the live instance.
• One "live" instance of Nexus with a very small disk.
    • All development done on the live instance.
• Small number of test servers, virtual but not cloud nodes.

Having about 30 developers really depending on a process that is set up like this is obviously a no go.

We started to figure we needed to put more effort into it when we were about to do our first rewrite of our deploy scripts. Still we didn't think in terms of a production mission critical system. We needed a resource and I kept insisting we needed a CM, more on that in an upcoming post. We had architecture and test working together building the application around the process. But we needed some more hands building the deploy scripts and also someone who could help us with the complexity of our system configuration. As I wrote in the entry on deploy scripts this didn't work out well at all. Mostly because the CM ended up working alone in a corner of the organization but also because he didn't share our vision of continuous delivery. Between all discussions trying to get us to implement branching strategies he was writing deployment scripts without any JBoss or DB competence. Obviously this didn't work out all that well and it was during this script rewrite that we started to realize that our process was mission critical. The new deploy scripts were very unstable and as mentioned our tests had stability issues.

Now we started realizing that we had a mission critical system on our hands and we needed to start treating it as such. Still this was a bit of an unknown entity in our landscape: operations only supports our office IT and our customer deliveries, while development supports tooling. While this for sure falls into the tooling department, the development organization isn't equipped to support a mission critical system. Still we had to do something about it, so this was when we created our tools team. We refer to it as a platform team as it was intended to own certain components such as logging, help desk, etc. But the main focus was to be continuous delivery. Our lacking development environment was another area of responsibility that we moved to this team, more on that as well in another entry.

The team consisted of our CM, an application DBA, a newly added senior Java developer and myself as architect/lead. It was obvious from the onset how effective it is when you have resources (with a full range of competence) that can focus on the process. This made us much more responsive to bugs in the process and faster in implementing changes.

We still at this date have not solved all the infrastructure issues but most of it is being worked on by the tools team and a new resource in our operations department who is responsible for our tooling servers. Still we don't have a Jenkins test environment and still the operations responsibility for Jenkins and Nexus isn't really well defined. But we have resources dedicated to the process and when something isn't working we handle it as bugs.

The biggest lesson is that it's really important to get dedicated resources from dev and ops early. Getting two 50% resources is better than one full time, as one isolated resource is a huge bottleneck and has a hard time prioritizing his work. Also make sure to have a bug/enhancement process in place early. Priorities should be made based on user experience, same as with any system in production. Also as soon as the process is in use by the developers you need a test environment for Jenkins (or whatever build server you use to drive the process) as it's a production system after all.

I think the reason we got a bit blindsided by the process becoming mission critical is that we haven't had anything similar in our landscape before. There is actually one thing that has grown mission critical at about the same rate, hand in hand with our CD process, and that's our JIRA server. In fact we have an even bigger dependency on our JIRA: if it goes down our developers have no clue what to work on and get stranded very quickly. For us this is a new type of mission critical system. Previously these have only been supporting systems.

Another reason is that the continuous delivery community talks about how easy it is to get started and how we can just take small baby steps from our nightly build CI. It is both true and the way to go. I just guess I wasn't reading the fine print which says "and then it becomes mission critical".

Wednesday, December 26, 2012

Process Scalability

When we started working on our continuous delivery process our team was very small, three devs in two sites in different time zones. During the first six months we added two or three developers. So we were quite small for quite some time.

Then we grew very quickly to our current size of about thirty developers and eight or so testers. We grew the team in about six months. Obviously this caused huge issues for us with getting everyone up to speed. It exposed all the flaws we have with setting up and handling our dev environment. But not only that, it also exposed issues with the scalability of our continuous delivery process.

With the increased number of developers the number of code commits increased. Since we test everything on every code commit our process started stacking test jobs. For each test type we had a dedicated server. So each deploy and the following test jobs had to be synchronized, resulting in a single threaded process. This didn't bother us when we were just a few code committers but when we grew this became a huge issue.

Figure: Dedicated test server being the bottleneck.

The biggest issue we had was that the devs didn't know when to take responsibility. If the process scales then the time it takes for a commit to go through the pipe is identical regardless of how many commits were made simultaneously. The time of our pipe was about 25-30 min. A bit long but bearable IF it would be the same time for each commit. But since the process didn't scale, the time for a developer's checkin to go through was X*25 min, where X = number of concurrent commits.
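The arithmetic is trivial but worth seeing next to each other. A minimal sketch in Python, using the lower 25 min figure for the pipe and assuming the commits simply queue up one after the other on the dedicated servers:

```python
PIPE_MINUTES = 25  # lower end of the 25-30 min pipe time


def feedback_minutes(concurrent_commits, scales):
    """Wait for the last commit in a batch of concurrent commits to get feedback."""
    if scales:
        # Every commit gets its own test servers, so the wait is constant.
        return PIPE_MINUTES
    # Commits queue up behind each other on the dedicated test servers.
    return concurrent_commits * PIPE_MINUTES


if __name__ == "__main__":
    for commits in (1, 3, 6):
        print(
            commits,
            feedback_minutes(commits, scales=False),
            feedback_minutes(commits, scales=True),
        )
```

With six concurrent commits the last developer waits 150 min instead of 25, which is the kind of wait nobody sits through at the end of the day.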

This was particularly bad in the afternoon when developers wanted to checkin before leaving. Sometimes a checkin could take up to two or three hours to go through and obviously devs wouldn't wait it out before leaving. So we almost always started the day with a broken pipe that needed fixing. Worse yet, our colleagues in other timezones always had broken pipes during their day and they usually lacked the competence to fix the pipe.

Since the hardest thing with continuous delivery is training developers to take responsibility, it's key that it's easy to take responsibility. Visibility and feedback are very important factors but it's also important to know WHEN to take responsibility.

The solution was obviously to start working with non dedicated test servers. Though this was easier said than done. If we had had cloud nodes this would have been a walk in the park to solve. Spawning up a new node for each assembly and hence having a dedicated test node per assembly would scale very well. But our world isn't that easy. We don't use any cloud architecture. Our legacy organization isn't a very fast adopter of new infrastructure. This is quite common for most large old organizations and something we need to work around.

Our solution was to take all the test servers we had, put them into a pool of servers and assign each of them to testing of one assembly at a time.

Figure: Pipe 1 has to finish before any other thread can use that pooled server instance.

This solves scaling but creates another problem: we need to return servers into the pool. With cloud nodes you just destroy them when done and never reuse them. Since we do reuse, we need to make sure that once a deploy starts on a pooled server all the test jobs get to finish before the next deploy starts.
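The rule "reserve a server for the whole pipe and only hand it back when every job has finished" can be sketched as below. This is a minimal illustration of the idea in Python, not something we built: the class, the function and the host names are all hypothetical.

```python
import queue
from contextlib import contextmanager


class ServerPool:
    """A pool of reusable test servers (hypothetical host names)."""

    def __init__(self, hosts):
        self._free = queue.Queue()
        for host in hosts:
            self._free.put(host)

    @contextmanager
    def reserve(self, timeout=None):
        """Take a server out of the pool and return it when the block exits."""
        # Blocks until a server is free; raises queue.Empty if the timeout expires.
        host = self._free.get(timeout=timeout)
        try:
            yield host
        finally:
            # Returned even if a job fails, so a broken pipe never leaks a server.
            self._free.put(host)


def run_pipe(pool, assembly, jobs):
    """Run the deploy and all test jobs for one assembly on a single server."""
    with pool.reserve() as host:
        return [job(assembly, host) for job in jobs]


if __name__ == "__main__":
    pool = ServerPool(["test-1", "test-2"])
    jobs = [
        lambda assembly, host: f"deploy {assembly} on {host}",
        lambda assembly, host: f"functional tests for {assembly} on {host}",
    ]
    print(run_pipe(pool, "assembly-42", jobs))
```

The important property is that the reservation spans the entire pipe and not each individual job, so no other deploy can land on the server between the deploy and the last test job.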

We were quite uncertain how we wanted to approach the pooling. Did we really want to build some sort of pool manager of our own? We really, really didn't because we felt that there has to be some kind of tool that already does this.

Then it hit us. Could we do this with Jenkins slaves? Could our pool of test servers be Jenkins slaves? Yes they could! Our deploy jobs would just do a localhost deploy and our test jobs would target localhost instead of the IP of a test server.

The hard part was to figure out how to keep a pipe on the same slave and not have another pipe hijack that slave between jobs. But we finally managed to find a setup that worked for us where an entire pipe is executed on the same slave and Jenkins blocks that slave for the duration of the pipe.

As of writing this post we are just about to start re-configuring our jobs to set this up. Hopefully when we have this fully implemented in two weeks or so we will have a process that scales. For our developers this will be a huge improvement as they will always get feedback within 25 min of their checkin.

Monday, December 17, 2012

Test stability

The key to a good Continuous Delivery process is a stable regression suite. There are a few different types of instability one can encounter.

• Lack of robustness
• Random test result
• Fluctuation in execution time

Lack of robustness often comes from incorrectly defined touch points. If a test needs to include a lot of functionality in order to verify a small detail then it's bound to lack robustness. Even worse is if a test directly touches implementation.

Figure: Adding extra interfaces to create additional touch points.

I'm sure everyone has broken unit tests when refactoring without actually breaking any functionality. This almost always happens when unit tests evaluate implementation and not functionality. For example when you just split methods to clean up responsibility.

Though this doesn't only happen to unit tests; it can happen just as easily, if not more easily, to functional acceptance tests. If they for instance touch the database in validation, instrumentation or data setup you will break your tests every time you refactor your database.

Testers seem to be very keen on verifying against the database. This is understandable since in an old monolith system with manual verification you only have two touch points: GUI and DB. It's important to work with test architecture and to educate both developers and testers to prevent validation of implementation.

Another culprit is dependencies between implementation and test code. For example Fitnesse fixtures that directly access, model objects, DAOs or business logic implementation.

Tests must be robust enough to survive a refactoring and addition of functionality. Change of functionality is obviously another matter.

We actually managed this quite well with our touch points.

Random test failures are a horrible thing because they will result in either unnecessary stoppage of the line or people losing respect for the color red.

This is where we have had most of our issues. At one point our line was failing 8 of 10 times it was executed and at least 7 of these were false negatives. "Just run em again" was the most commonly heard phrase. Obviously the devs totally lost respect for the color red. The result was that when now and again a true bug caused the failures no one cared to fix it, and the bugs started stacking up.

The Jenkins job history for our Functional Tests could look like this.

So why did we have these problems? First of all our application is heavily asynchronous, so timing is a huge issue in our tests. Many of our touch points are asynchronous triggers that fire off some stuff in the application, and then we wait for another trigger from the application to the test before we validate. We don't really have much architectural room here as it's an industry standard pattern. This in itself isn't a big deal, except that each asynchronous task schedules other asynchronous tasks. So a lot of things happen at the same time.
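The wait-for-trigger pattern can be sketched roughly like this. It is a minimal illustration in Python (picked only for brevity), not our actual test code: `TriggerWaiter` and `fake_application` are made-up names, and the timer simply stands in for the application calling back to the test.

```python
import threading


class TriggerWaiter:
    """Test-side receiver for an asynchronous trigger from the application."""

    def __init__(self):
        self._event = threading.Event()
        self.payload = None

    def on_trigger(self, payload):
        # Called from the application's thread when the work is done.
        self.payload = payload
        self._event.set()

    def wait(self, timeout_seconds):
        # Returns True if the trigger arrived in time, False on timeout.
        return self._event.wait(timeout_seconds)


def fake_application(request, callback, delay_seconds=0.05):
    """Hypothetical stand-in: answers the request later, on another thread."""
    result = {"request": request, "status": "done"}
    timer = threading.Timer(delay_seconds, callback, args=(result,))
    timer.start()
    return timer
```

The explicit timeout is the important part: a trigger that never arrives then fails the test instead of hanging the pipe.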

Since it's desirable to keep execution times short we reconfigure the system at test time to use much shorter timeouts and delays. This further increases the amount of simultaneous stuff that happens for each request.

When all these simultaneous threads hit the same record in one table you get transactional issues. We had solved this through use of optimistic locking. So we had a lot of rollbacks and retries. But it "worked". But our execution times were very unpredictable, and since our tests were sensitive to timing they failed randomly.
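For readers who haven't worked with optimistic locking, here is a minimal sketch of the mechanism and of why it costs retries. It is in Python for brevity and everything in it is an assumption for illustration: `VersionedRow` is an in-memory stand-in for one database row with a version column, and the `between_read_and_write` hook only exists to simulate a rival writer.

```python
class StaleVersionError(Exception):
    """Raised when the row changed since it was read."""


class VersionedRow:
    """In-memory stand-in for one database row guarded by a version column."""

    def __init__(self):
        self._row = {"value": 0, "version": 0}

    def read(self):
        return dict(self._row)

    def write(self, new_value, expected_version):
        # The write only succeeds if nobody else updated the row in between.
        if self._row["version"] != expected_version:
            raise StaleVersionError()
        self._row = {"value": new_value, "version": expected_version + 1}


def increment_with_retry(row, max_attempts=5, between_read_and_write=None):
    """Read, modify, write; start over whenever the version check fails."""
    for attempt in range(1, max_attempts + 1):
        snapshot = row.read()
        if between_read_and_write is not None:
            between_read_and_write(attempt)  # hook to simulate a rival writer
        try:
            row.write(snapshot["value"] + 1, snapshot["version"])
            return attempt  # number of attempts it took
        except StaleVersionError:
            continue
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```

Every conflict costs a full extra round trip, so with many writers on the same row the run time of a request depends on how unlucky it gets. That is the kind of unpredictability that hurts timing-sensitive tests.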

Really though, did we congest our test scenarios so much that this became a problem? I mean, how on earth could we hit the optimistic lock so often that it resulted in 7 out of 10 regression runs failing due to it?

Wasn't it actually so that the tests were trying to tell us something? Eventually we found that our GET requests were actually creating transactions due to an incorrectly instantiated empty list, which marked an object as new. We also changed our pool sizing, as we were exhausting the pool and causing a lot of waiting.

So we had a lot of bugs that we blamed on the nature of our application.

Listen to your tests!! They speak wise things to you!

Eventually we refactored that table everyone was hitting so that we don't do any updates on it but rather track changes in a detail table. Now we didn't need any locking at all. Sweet free feeling. Almost like dropping in alone on a powder day. ;)
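The idea of tracking changes in a detail table instead of updating one hot row can be sketched like this. It is a Python illustration under assumptions: `ChangeLog` is an in-memory stand-in for the detail table, and numeric deltas are used only to keep the example small, not because that is what our table contains.

```python
class ChangeLog:
    """In-memory stand-in for an insert-only detail table."""

    def __init__(self):
        self._rows = []

    def record_change(self, record_id, delta):
        # Writers only ever insert, so two writers never compete for one row
        # and there is no version to check and nothing to retry.
        self._rows.append((record_id, delta))

    def current_value(self, record_id):
        # The current state is derived on read from the detail rows.
        return sum(delta for rid, delta in self._rows if rid == record_id)
```

The price is that reads have to aggregate, which is usually a fair trade when writes are the contended part.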

Fluctuation in execution time. I find it just as important that a test executes equally fast every time as it is that it always gives the same result.

First of all a failed test should be faster than a success, give me feedback ASAP! Second, if it takes 15 sec to run it should always take 15 sec, not 5 or 25 at times. This is important when building your suites. You want to have your very fast smoke test suites, your bang-per-minute short suites and your longer running suites. Being able to easily group tests by time is important.
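Grouping by time and checking for stable run times could be sketched as below. This is Python for brevity; the suite names, the limits of 5 and 60 seconds and the 20% tolerance are arbitrary assumptions for the example, not numbers from our pipe.

```python
from statistics import mean


def group_by_duration(durations, smoke_limit=5.0, short_limit=60.0):
    """Sort test names into suites by their typical run time in seconds."""
    suites = {"smoke": [], "short": [], "long": []}
    for name, seconds in sorted(durations.items()):
        if seconds <= smoke_limit:
            suites["smoke"].append(name)
        elif seconds <= short_limit:
            suites["short"].append(name)
        else:
            suites["long"].append(name)
    return suites


def is_stable(samples, tolerance=0.2):
    """True when the spread of run times stays within tolerance of the mean."""
    average = mean(samples)
    return (max(samples) - min(samples)) <= tolerance * average
```

A test that fails `is_stable` is hard to place in any suite, which is one more reason to treat fluctuating run times as a defect in their own right.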

It's also important to be able to do a trend analysis on the suite execution time. It's a simplistic but effective way to monitor decrease in performance.

We have actually nailed quite a few performance bugs this way.
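A simplistic trend check on suite execution time might look like this. It is a hedged Python sketch: the window of 5 runs and the 20% threshold are assumptions, and in practice the history would come from whatever your CI server records per build.

```python
from statistics import mean


def slowdown_ratio(history, window=5):
    """Mean of the latest runs divided by the mean of the runs just before."""
    if len(history) < 2 * window:
        raise ValueError("need at least %d runs" % (2 * window))
    recent = history[-window:]
    baseline = history[-2 * window:-window]
    return mean(recent) / mean(baseline)


def is_regression(history, window=5, threshold=1.2):
    """Flag the suite when recent runs are clearly slower than the baseline."""
    return slowdown_ratio(history, window) > threshold
```

Comparing averages over a window rather than single runs keeps one noisy build from raising a false alarm.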

Asynchronous nature makes everything harder to test but don't make it an excuse for yourself. Random behavior is tricky and frustrating in a pipe but remember TRUST YOUR TESTS!

Wednesday, December 12, 2012

Deploy scripts, how hard can it be?

Obviously an integral part of a Continuous Delivery process is to get the artifacts deployed. The same deployment procedure should be used for every deploy. Every deploy means the deploy for every component test, functional acceptance test, load test, rollback test, user acceptance test and yes production as well.

For us this presented our first big challenge. We wanted the same deployment mechanism for all deployments, both into environments owned by the development organization and into production owned by operations. Yes, we are a legacy NoDevOps organization and there is no changing that anytime soon. I'll cover this more in another post. Basically each project deals with deploy scripts in its own way; some have more or less the same scripts for all environments, some don't. We wanted to change this.

We wanted two things: the same mechanism for all deploys, and all deploys triggered the same way. This meant that we had to agree on how to do it with our operations department. This also basically ruled out any sort of third party tooling like Chef or whatever. We felt that we didn't have the leverage and mandate to push a tool on them. We were crossing our fingers that they wouldn't shoot down our proposal to trigger deploys from Jenkins.

They actually didn't shoot down using Jenkins, but they forced us to set up a Jenkins that the developers wouldn't have access to. As in most legacy organizations, devs aren't allowed to touch a production deploy. They also had some really good input on our initial rudimentary deploy scripts.

We had written some rudimentary scripts early on just to get something deploying. These were quite non-generic, hard-coded bash scripts. They handled the transfer of artifacts from Nexus to the target server, running of Liquibase scripts and restart of JBoss and Mule servers.

So we had the input from operations how to make the scripts better (required to use for prod deploy) and our desire to make the scripts generic. How hard can it be? It's just moving some wars and stuff to a server where we need to put it in the right place and then restart some stuff.
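To make "generic" a bit more concrete, here is a rough sketch of what it could mean: the script takes a description of the delivery and derives the steps, instead of hard coding them per delivery. It is Python rather than bash purely for readability, and the artifact types, step texts and component names are all assumptions for illustration. It only builds a plan and runs no real commands.

```python
# Hypothetical mapping from artifact type to the generic steps it needs.
STEPS_BY_TYPE = {
    "war": ["fetch {name} from repository", "copy {name} to JBoss deploy dir"],
    "mule-app": ["fetch {name} from repository", "copy {name} to Mule apps dir"],
    "liquibase": ["fetch {name} from repository", "run Liquibase changelog {name}"],
}

# Servers that need a restart once all artifacts of a type are in place.
RESTARTS = {"war": "restart JBoss", "mule-app": "restart Mule"}


def deploy_plan(components):
    """Turn a delivery description into an ordered list of deploy steps."""
    plan, restarts = [], []
    for name, kind in components:
        plan.extend(step.format(name=name) for step in STEPS_BY_TYPE[kind])
        restart = RESTARTS.get(kind)
        if restart and restart not in restarts:
            restarts.append(restart)  # restart each server only once
    return plan + restarts
```

With something like this, a new delivery is a new list of `(name, type)` pairs rather than a new copy of the script.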

We decided that we first wanted to do the changes operations required, as that would allow us to deploy our first delivery to production. Problem was that while we were doing this we got more deliveries and more components that needed deploying to component test servers. So we ended up with a huge set of copy pasted scripts in different stages of development. This bag of copy pasted scripts required maintenance and slowed our development of the main line scripts.

This is really a sad story in our development. It took us several months to get the scripts rewritten to match the requirements of operations. By the time this was done we so desperately needed our generic scripts that we had to throw out the mudball that our scripts had turned into and rewrite them again. This time it went much faster, a few weeks, but the migration of all the other components and deliveries that used different versions of the old scripts took quite some time as well.

Here are a few reasons we got into this mess (other than not having a DevOps organization).

1. We didn't have a standardized packaging strategy for our components and deliveries, making it hard to generalize scripts.
2. Having a script developer who was good with Linux but no JBoss, Mule or liquibase skills AND not pairing him with a developer possessing these skills.
3. Starting too late with the rewrite of the rudimentary scripts. We knew for a few months what we had to do but didn't do it.
4. Last but not least leaving rollback mechanisms and rollback pipe building to the last minute.

Get your packaging sorted out early and get your deployment mechanism in place well in time before first production deployment. Also have your rollback strategy sorted out and in testing as part of your pipe early, well before production. I'll cover rollback in another post.

So no, deploy scripts aren't that hard if you do things in the right order and don't kill yourself by making mistakes on all levels.

Saturday, December 8, 2012

The impact of Continuous Delivery on the role of the Tester

Continuous Delivery really does change the way we work. It's not just tools and processes. We know that because every talk and every blog says it does, but when you really see it, it's quite interesting.

The change in the level of responsibility required of each developer has been one of the hardest things to manage. Don't check shit in or you will break the pipe! Don't just commit and run, it's your responsibility that the pipe is green! All that is frustrating and has been the single most time consuming thing on our journey, but it's worth a post of its own.

What I'd like to focus on is the changes to the role of the tester. Previously we have worked mostly with manual GUI driven testing. Our testers have tons of domain knowledge and know our systems inside out. We have worked with test automation in some projects and I have some test automation experience from the past. I've been trying to champion test automation for years but we have had a bit of a hard time getting it off the ground. When we have, it's not been test automation but rather automated tests that require some kind of tester supervision.

In our current project, as I've written about in previous posts, we decided to go all in. When we started we were a team of just architects. I was leading the work on the test automation. We got really good results working a lot with test architecture and testability architecture in our application. But after some time we started to suffer from lack of tester input in our testing. We obviously ended up with a very happy case scenario oriented suite.

So we started to bring testers on to our project, but what profiles should we look for? First we started to look for just our notion of what a tester was and had always been in our projects. We happily took on testers with experience of testing and test managing portals, order systems, corporate websites etc. But our delivery at this time didn't have a GUI; all testing was done using Fitnesse. We didn't really get the interaction between dev and test we were hoping for.

We did get use for our testers for partner integration testing, which was manual (using a REST client). But it wasn't really working well because they didn't know the interfaces and the application, as they hadn't worked with the test automation.

We have been having this same experience for over a year. Our testers don't seem to be able to get involved in our test automation. But we do have a few who are and we are super lucky to have found them because we have not really had much of a clue when recruiting.

Our developers, who are very modern and in a lot of cases super interested in test automation, have constantly been bashing us about our choices of test tools. As I talked about in my previous post they totally refused SoapUI and aren't all that fond of Fitnesse either, even though they prefer it a lot over SoapUI. But my take has always been "we need testers to feel comfortable with the tool, it's their domain, let them pick".

After we decided to move to REST Assured I started to realize the problem I've failed to grasp for the six years or so I've been working with test automation. There are two sets of testers: GUI testers and technical testers. A GUI tester will always struggle with test automation. He/she has little to no experience or education in coding. The technical tester has a background as a developer or started developing as part of an automation interest.

Still, even the technical testers we have who have experience from test automation have had a transition period coming into the continuous world. Testers do tend to accept manual steps that aren't acceptable in a continuous world. It's not ok to just quickly verify this bug manually because it's so simple and changing the test case that had a gap is a lot of work. It's not ok to just add that user into the DB manually. It's not ok to verify against the DB.

In the past our demand for GUI clicking testers has been high and our demand for technical testers has been low. At least in our old nonCD world the TestPyramid, http://martinfowler.com/bliki/TestPyramid.html, was upside down.

The Continuous Delivery expansion will drive a shift in what we look for in testers. Our tester demographics will move towards matching the pyramid. We will still need the GUI testers but their work will move more towards requirements gathering and acceptance testing. Meanwhile I think we will see a new group of testers, with a much stronger developer background, come in and work with the automation.

This new group of technical testers or developers with super high understanding of testing is still hard to find but I hope we will see more of them.

Picking the tool for Component Testing

As I discussed in the previous post we realized we need to test our components, not just unit test code and do functional testing. The biggest reason was that we were cluttering our functional tests with validation that didn't belong at that level. Cluttering tests with logic that doesn't belong at that level was increasing our test maintenance costs, something we knew from the past we had to avoid.

Since our application is based on REST services exposed by components with very well defined responsibility, it's very well suited for component testing.

The responsibility for our components was clearly defined. A component test is a black box test with the responsibility of validating the functionality and interfaces of the component. This definition was somewhat of a battle, as somehow our testers (in our project and others in our company) seem to insist that they want to validate stuff in the database. I don't like being unreasonable but this was one thing I felt I had to be unreasonable on: black box means get the f*** out of our database. I still feel a bit bad about overriding our test managers on this but I still feel strongly that it had to be done. So all tests go through public or administrative interfaces on our components. The reason for this is obviously that I wanted our tests to support refactoring and not break when we change code.

The upside of forbidding database access was that we had to make a few administrative interfaces that we could eventually use for building admin tools.

So now what tools should we use? We were using Fitnesse for our functional testing, but Fitnesse's biggest weakness is that it doesn't separate test flow from test data. With functional testing this isn't really a big issue as each test is basically a flow of its own. But with component testing, which by nature is more detailed, we saw that we would get many more test cases per flow. Another weakness is that Fitnesse doesn't go all that well together with validation of large XML/JSON documents.
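What separating test flow from test data buys you can be shown with a small sketch. It is in Python for brevity and is tool agnostic: `register` is a hypothetical stand-in for a component's public interface, and the status codes in `CASES` are assumptions for the example.

```python
def register(first_name, last_name, email):
    """Hypothetical component under test; returns an HTTP-like status code."""
    if not first_name or not last_name:
        return 400
    if "@" not in email:
        return 400
    return 200


# Test data: one row per case. Adding a case means adding a row, nothing else.
CASES = [
    ("Joe", "Doe", "jd@test.mail.xx", 200),
    ("", "Doe", "jd@test.mail.xx", 400),
    ("Joe", "", "jd@test.mail.xx", 400),
    ("Joe", "Doe", "not-an-email", 400),
]


def run_flow(cases):
    """Test flow: written once and executed for every row of data."""
    failures = []
    for first_name, last_name, email, expected in cases:
        actual = register(first_name, last_name, email)
        if actual != expected:
            failures.append((first_name, last_name, email, expected, actual))
    return failures
```

With many detailed cases per flow, the flow stays one function while the data table grows, which is the property component tests need.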

We do our GUI testing with a Selenium and Fitnesse combo and that we would continue doing. But for our REST service testing our first choice was SoapUI. We prototyped a bit and decided we could accomplish what we wanted with it, so we started building our tests. This was the single biggest mistake we made with our continuous delivery process.

Back in the days when we just did functional tests for our deliveries in Fitnesse we had a nice process of test driven development. Our developers actively worked on developing test cases and fixtures, and the tests went green as the code hit done. I really like it when it's functional test driven development and think this is the best form of TDD. This went totally down the drain when we started using SoapUI for our component tests.

Our developers refused to touch SoapUI and started handing over functionality to testers for testing after they had coded it. This resulted in total chaos as we got a lot of untested checkins. Backloaded testing works extremely badly with a continuous delivery process. Especially since we didn't use feature flags.

This put us in an interesting dilemma. Do we choose a test tool that testers feel comfortable working with or do we choose a tool that developers like? I personally am very unreligious when it comes to tools. If it does the job then I don't care so much. But I've always had the opinion that test tools are for testers and it's up to them; devs need to suck it up and contribute. Testers always seem to like clicky tools so I wasn't surprised that they wanted to use SoapUI.

We were sitting with a broken test process and our devs and testers no longer working together. Fortunately our testers came to the same conclusion and realized this tool was a dead end for us. The biggest killer was how badly the tool was suited for teamwork and versioning, even the enterprise edition. Each and every checkin caused problems and you basically need to check in all your suites for each reference you change. Horrible.

So after wasting nearly 3-4 months, growing our test dept and killing our dev process on SoapUI, we decided to switch to REST Assured. For some of our components this is a definite improvement, and it's definitely an improvement process-wise as our developers are happy to get involved. But I do still see some possible issues on our horizon with this choice. Though the biggest change is for our testers and how we as an organization view the tester role, and that will be the topic of my next post.

One very nice thing though: our continuous delivery process is Maven based, so the change of tooling didn't affect it. Each test is triggered with mvn verify, and as long as the tool has a Maven plugin we don't really care what tool the teams use for their components.

Tuesday, December 4, 2012

Automated System Testing

As I described earlier we decided that we wanted to prioritize automation of our functional system testing. The trickiest thing for us was finding the right abstraction level for these tests. In previous projects where we have been working with Fitnesse as a test tool we have failed to find a good abstraction level.

We have always ended up with tests that are super technical in nature and cluttered with details that are very implementation specific. The problem this has given us is that very few people understand the test cases. Especially testers tend to have problems understanding them as they are so technical. If our testers can't work with our tests then we do have some sort of problem (quite a few and I will get to them later).

This time around we set out to increase the level of abstraction in the tests to make them more aimed towards testing requirements and less towards testing technology. We were hoping to achieve two things: less maintenance and more readability. The maintenance budget for our automated tests had been high in the past. The technical tests exposed too much implementation and hence most refactoring resulted in test rewrites. This was shit because our tests didn't help to secure our refactoring. This was something we had to address as well.

The level we went for was something like this. (In pseudo fitnesse)

|Register user with first name|Joe|last name|Doe|and email|jd@test.mail.xx|
|verify registration email|

We abstracted out the interfaces from the test cases and just loosely verified that things went right.

This registration test could be testing registration through the web portal or directly on the REST Service. This did lower the maintenance drastically. It did give us room to refactor our application without affecting the test cases. Though we were still having trouble getting our testers to write test cases that worked and made sense. The example above is obviously simplified and out of context; our test cases were quite long and complex. The biggest problem was what fixture method to use and how to design new ones with the right abstraction level.
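The abstraction can be sketched in code as a fixture that knows what to do but not through which channel. This is a Python illustration, not our Fitnesse fixtures: `FakeSystem`, `RestChannel`, `PortalChannel` and `RegistrationFixture` are hypothetical names, and both channels are faked so the example stays self-contained.

```python
class FakeSystem:
    """Hypothetical stand-in for the deployed application and its mail server."""

    def __init__(self):
        self.outbox = []

    def create_user(self, first_name, last_name, email):
        self.outbox.append(email)  # pretend a confirmation mail was sent


class RestChannel:
    def __init__(self, system):
        self.system = system

    def register(self, first_name, last_name, email):
        # A real implementation would POST to the REST service here.
        self.system.create_user(first_name, last_name, email)


class PortalChannel:
    def __init__(self, system):
        self.system = system

    def register(self, first_name, last_name, email):
        # A real implementation would drive the web portal here.
        self.system.create_user(first_name, last_name, email)


class RegistrationFixture:
    """Speaks the language of the requirement; the channel is plugged in."""

    def __init__(self, channel, system):
        self.channel = channel
        self.system = system
        self.email = None

    def register_user(self, first_name, last_name, email):
        self.email = email
        self.channel.register(first_name, last_name, email)

    def verify_registration_email(self):
        return self.email in self.system.outbox
```

The test case only ever calls `register_user` and `verify_registration_email`, so swapping the channel or refactoring behind it leaves the test case untouched.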

This exposed the need for something we really hadn't thought much about before: test architecture. We were really (and still are) lacking a test architect. Defining abstraction layers, defining test components, reusability of test components, handling of test data, etc. All these go dramatically wrong when done by testers alone and are not good enough when done by developers either.

Another problem we ran into with this abstraction level was that our testers didn't know our interfaces. The test doesn't really care about the interface, so the tester doesn't need to take responsibility for the REST interface. Imho the responsibility of securing the quality of a public REST interface should lie with the testers. But here we give them tools to only secure registration functionality. Yes, they are verifying the requirement to be able to register, but the verification of the REST interface is implicit and left to the fixture implementation. This is not good.

It also generated a huge problem for us. We had integration tests with our partners that consume our REST interfaces, and our testers who were in the sessions didn't know our interfaces. When should they have learned them? This was the first time they were exposed to them. They were also required to use tooling that they didn't otherwise use to test, REST clients of whatever flavor.

We had solved one thing but created another problem.

Had we abstracted too much? Was this the wrong level to test on?

Before we analyzed this we started to compensate for these issues by just jamming in more validations.

!define METHOD {POST}
|Register user by sending a |REST_REQUEST|${METHOD}| with first name|Joe|last name|Doe|and email|jd@test.mail.xx|
|RESPONSE_CODE|200|
|verify registration email|

Still a fictive example but I'm trying to just illustrate how it went downhill. So we basically broke our intended abstraction layer and made our tests into shit. But it got worse. Our original intent was that registration is registration regardless of channel, web portal or REST. But then we had to have different templates for different users. Say that it was based on gender. So we took our web portal test case and made that register a female user and verify the template by string matching, and then we used the REST interface for the guys.

Awesome, now we made sure that we got an HTTP 200 on our REST response, and we made sure that we used pink templates for the guys and green ones for the gals. Awesome, and we made sure to test our web interface and our REST interface. Sweeet!!! We had covered everything!

Well maybe not.

This is when we started to think again and our conclusion was that the abstraction layer of the initial tests was actually quite ok.

|Register user with first name|Joe|last name|Doe|and email|jd@test.mail.xx|
|verify registration email|

This tests a requirement. It's a functional test. It does have a value. Leave it at that! But it can't be our only test case to handle registration.

Coming back to this picture

Our most prioritized original need was to have regression tests on the functional system, end to end of our delivery. We had that. We also had unit tests on the important mechanisms of the system; not on everything, and yes too little, but on the key parts. Still we were lacking something, and that something was component/subsystem end to end tests. This was something we deep down always knew that we would have to make, but we had ignored it for quite some time due to "other priorities". I still think we made the right call back then and there when we prioritized, but this has been the most costly call of our entire journey.

So we started two tasks. One was retrofitting Component Tests and the other was cleaning up our System Tests. More on how that went and how we decided to abstract these in the next post.

Friday, November 23, 2012

What is Continuous Regression Testing?

In my previous entry I said that our business case was Continuous Regression Testing. I will make a few entries on the subject.

Our definition was quite clear on what level of testing we wanted to achieve.

We define it at the System Test level, meaning regression testing of our end to end functionality as specified by our Use Cases. Important here is end to end and driven by Use Cases.

As an example, say that our system would have a "Register User" Use Case which states that the user should be able to register through a portal and receive a confirmation email. In this case our end to end delivery is from our web front to our mail server. We would be creating an automated Test Case which registers the user through the portal and then evaluates that there is a mail sent to the email address of the registered user.

I and others in our company (not on the team at the time) have had quite a bit of experience working with automated System Tests. Yet we ran into quite a few mines (AGAIN!!) when building our test suites. I will cover these later.

So we had a clear understanding on how we wanted to test and we knew we wanted to test it often. For us testing on each code commit and so heavily relying on test automation was way out of our comfort zone. But our business case was strong and we knew why we wanted to go down that road.

Early on we made a few key decisions. One was to test everything on every commit. Second was to only build each artifact once. We discussed this and came to the conclusion that a rebuilt artifact would have to be retested, so building only once would make our process simpler and more correct.
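Why a rebuild needs a retest can be illustrated with a small sketch. It is Python for brevity and entirely hypothetical: `build` fakes a build whose output embeds the build time (as many real build outputs do), and promotion is only allowed for the exact bytes that were tested.

```python
import hashlib


def fingerprint(artifact_bytes):
    """Identify an artifact by its content, not by its name or version."""
    return hashlib.sha256(artifact_bytes).hexdigest()


def build(source, build_time):
    """Hypothetical build: the output embeds the time it was built."""
    return ("%s|built:%s" % (source, build_time)).encode("utf-8")


def can_promote(artifact_bytes, tested_fingerprints):
    """Only binaries whose exact bytes passed testing may move down the pipe."""
    return fingerprint(artifact_bytes) in tested_fingerprints
```

Two builds of the same commit are not the same binary, so the only artifact you know anything about is the one that actually went through the tests.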

Our architecture was based on a few well defined components (publishing REST services) that together formed our deliverable.

This is a simplistic view of something similar. Some components were internal and some were externally integrating.

We were aware that we most likely would need to test the components in a continuous way but we ignored it on purpose. Our focus had to be on our delivery as well, and we all know too well that we need to prioritize well when working on deadlines. It's easy to get carried away on theoretical models and a desire to solve everything, but it's not the way we work. We always want something deliverable and then iterate on it.

So our initial pipe ended up looking something like this. Each of our components was built, unit tested and released on commit. Then our deployment assembly was updated one component at a time. Each time the assembly was updated we deployed it and tested it on our test server.
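Updating the assembly one component at a time can be sketched as follows. This is a Python illustration under assumptions: the assembly is modelled as a plain map of component name to released version, and the `name.version=...` property format is made up for the example.

```python
def update_assembly(assembly, component, version):
    """Return a new assembly with exactly one component version changed."""
    if component not in assembly:
        raise KeyError(component)
    updated = dict(assembly)  # the previous assembly stays untouched
    updated[component] = version
    return updated


def render_properties(assembly):
    """Render the assembly as sorted 'name.version=x' lines."""
    return "\n".join(
        "%s.version=%s" % (name, version)
        for name, version in sorted(assembly.items())
    )
```

Because each new assembly differs from the previous one by a single component, a red test run points straight at the component that changed.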

It was a very simplistic start to our Continuous journey but it was very adequate for us and most importantly it solved our primary concern: Continuous Regression of the customer application.

It's important to remember that the application was new and our team was small at the time.

Thursday, November 22, 2012

Where to start with Continuous Delivery

A lot of people have already started the journey without really knowing it. There are few companies today that don't utilize some key parts of a Continuous process. When reading books and listening to speakers at conferences it's easy to feel totally stone age. But the key is to remember that the journey is baby steps towards a vision. Continuous Delivery is just a progression of Continuous Integration, and a lot of the basics are built upon in the progression.

By realizing we have something we can start to think about what we really need out of a Continuous Process. In some sense we all want to release all the time and we want our artifacts to be server images deployed on cloud nodes monitored by super cool graphs of all the metrics we can possibly think of. That is all nice but we all do need to provide a, hate the word, business value. So where is the business value in a continuous process for our organisation? That is the question we should ponder before we start.

In fact we didn't have to ask ourselves the question in that way. Instead we were provided a challenge. For us it was key to solve system regression testing. Test automation and lowering cost in regression testing is obviously one of the key reasons to have Continuous Delivery. For us the key business case was to solve Continuous Regression testing. Continuous whatever else would come as a bonus.

Knowing that Continuous Regression Testing and not Continuous Deploy was our main priority made it easy for us to prioritize our efforts. Our efforts went largely into Testability Architecture in our application, Test Architecture in our test automation frameworks and automating a pipe that would deploy and test each code commit.

So really as with any agile development find your business value, focus on delivering it with a minimal effort "good enough" approach and iterate from there.

Over the coming posts I will go through what we focused on in order to achieve Continuous System Regression Testing.

What drives my interest in Continuous Delivery?

I want to mitigate the risk of my presence in a project.

My weaknesses as an individual, both privately and professionally, are that I'm horrible at following instructions. I can't do it once, and I'm even worse at doing it multiple times. I feel physically sick if I need to do repetitive tasks; I get demotivated and hence I do an even worse job. I also tend to forget to do things because I've got my mind full of other stuff. I'm also very bad at testing. I've worked with test automation for nearly six years now and I still suck at writing good test cases.

What I am good at is creative problem solving, finding solutions to problems and thinking outside the box. I'm also quite good at inspiring people.

This is a horrible combination for any project. This guy who can't follow instructions constantly comes up with new "bright ideas", can't even test them, and even worse he gets people engaged and enthusiastic about his crazy ideas. Do we really want him on our team???

Yes I am a high risk and I need to be mitigated.

In all seriousness, I do love change, I thrive in an ever changing environment. I love the positive energy towards solving problems and not dwelling over them. I love high pace and new challenges. I also do know that in order to do what I love to do the "maintenance" of our work needs to be minimized as it's a huge time and cost sink for each project.

The "maintenance" part isn't just our production applications. It's our code, our tests, our mechanisms and of course our production deliveries. Every hour not spent maintaining these is spent on bringing in new business value into the organization and solving fun new challenges.

If I can contribute towards lowering the one-time and run time costs of a delivery then I'm satisfied because I know my work has really mattered. I'm also satisfied on a personal level because I know that I'm spending time solving new problems and not maintaining old problems.

I want to come to work and feel satisfied...

About Me

I've worked as a Java Developer/Architect for 15 years. I've worked as part of a consulting organization and as part of a line organization.

Over the last 6 years I've had an ever increasing interest in the quality of the delivery. Initially this interest led me to work with automation of system tests. Then more and more towards automation of release and deploy processes. Now for the last two years I've focused a lot of my work on the full Continuous Delivery process.

This blog will serve as a collection of lessons learned from my work. Mostly just for myself but I'm happy to share my experiences if anyone is interested.

Follow @TomasRihaSE
