Wednesday, March 27, 2013

Feature toggles: good or bad?

Everyone who has read at least once about continuous integration or continuous delivery knows that it is impossible, or at least very difficult, to practice if you use more than one development branch. If you have never faced any problems using two or more development branches, you are a fucking lucky guy or, probably, you are not refactoring your code as often as you should (in these rare cases, I strongly recommend that you take a look at this Martin Fowler & Mike Mason interview and Jez Humble's Continuous Delivery book). On the other hand, if you have found yourself in an endless merging task, or if you have been feeling insecure about some merge results, do not panic! Keep reading. You are not alone. :-)

As you may already know, there is no problem with branches. The problems are all about merges. So, to overcome big and error-prone merges, the strategy I recommend is Branch by Abstraction. According to Martin Fowler, Branch by Abstraction is “a term coined by Paul Hammant to describe a technique for making structural changes to a code-base without a Feature Branch in a source-code control system”. To make that real, we usually implement new features, or update existing ones, using a very simple technique known as Feature Toggles. In short, a feature toggle is a way of developing your software so that features can be enabled or disabled through configuration changes.
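To make the idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the `PriceCalculator` abstraction, the two implementations, and the `new-discount` toggle name. It assumes the toggles arrive as a simple name-to-boolean mapping loaded from configuration.

```python
from typing import Mapping, Protocol


class PriceCalculator(Protocol):
    """The abstraction both the old and the new code sit behind."""

    def total(self, amount_cents: int) -> int: ...


class LegacyCalculator:
    """Existing behaviour: the price is charged as-is."""

    def total(self, amount_cents: int) -> int:
        return amount_cents


class DiscountCalculator:
    """New feature under development: a 10% discount (illustrative rule)."""

    def total(self, amount_cents: int) -> int:
        return amount_cents - amount_cents // 10


def is_enabled(toggles: Mapping[str, bool], name: str) -> bool:
    # Unknown toggles default to Off, so unfinished code stays dark.
    return toggles.get(name, False)


def make_calculator(toggles: Mapping[str, bool]) -> PriceCalculator:
    # The toggle is checked in one place only; callers depend on the abstraction.
    if is_enabled(toggles, "new-discount"):
        return DiscountCalculator()
    return LegacyCalculator()
```

Both implementations live on the same branch, and switching between them is a configuration change rather than a merge. Keeping the toggle check inside a single factory function also makes the later cleanup easier.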

This kind of strategy is widely used by companies such as Facebook, Flickr, Google, Amazon, and, of course, the ones I’ve been consulting for. A very simple example of feature toggles can be seen in the Chrome web browser by typing chrome://flags in the address bar. On this page you can enable or disable features of your browser and help Google evolve it. ;-)

After some years of experience trying to improve clients’ delivery processes, we have found some advantages and disadvantages of using toggles instead of version control system branches. So, the main idea of this post is to share and discuss our experience with the community.

The advantages we have found are:

  • Ability to turn features On and Off at any moment
    • If you find out that you have introduced a new bug in production, for example, you can just turn your feature Off.
  • Move forward, never backward
    • Since you can disable incomplete or buggy features, you can move on and fix them instead of rolling back your entire app (which in some cases is extremely painful).
  • Ability to put unfinished code in production
    • At first glance it may sound a bit weird, but it helps you reduce the amount of stuff you put in production at once. Small batches reduce the risk of going live (see Reducing IT risk using continuous delivery).
  • Reduce or eliminate the necessity to create branches in a source-code control repository
    • Your integration team will be very thankful, believe me ;-). This means fewer merges, or none at all, giving them more time for smarter tasks than merging.
  • Ability to run experiments that could not be done outside of the production environment
    • For example, those intermittent issues which are impossible to reproduce outside of production.
  • Ability to enable a feature for only one group of users
    • For example, you can enable a feature for a specific region, role, user group, or any other rule you need.
  • Ability to enable and disable the features while the app is up and running
    • But you may also set toggles at build time if you prefer.
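The last two advantages (enabling a feature for one group of users, and switching it while the app is running) can be sketched as follows. This is only an illustration under assumptions of mine: the `User`, `Toggle` and `ToggleRegistry` names are hypothetical, the registry is kept in memory, and thread safety and persistence are left out.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class User:
    name: str
    region: str
    role: str


@dataclass
class Toggle:
    enabled: bool = False
    # An empty set means "no restriction" for that rule.
    regions: FrozenSet[str] = frozenset()
    roles: FrozenSet[str] = frozenset()

    def is_on_for(self, user: User) -> bool:
        if not self.enabled:
            return False
        if self.regions and user.region not in self.regions:
            return False
        if self.roles and user.role not in self.roles:
            return False
        return True


class ToggleRegistry:
    """In-memory registry that can be changed while the app is running."""

    def __init__(self) -> None:
        self._toggles: Dict[str, Toggle] = {}

    def set(self, name: str, toggle: Toggle) -> None:
        self._toggles[name] = toggle

    def turn_off(self, name: str) -> None:
        # The "kill switch": disable a buggy feature without a rollback.
        if name in self._toggles:
            self._toggles[name].enabled = False

    def is_on(self, name: str, user: User) -> bool:
        toggle = self._toggles.get(name)
        return toggle is not None and toggle.is_on_for(user)
```

A hypothetical usage: register `Toggle(enabled=True, regions=frozenset({"BR"}))` under some name, and only users whose region is `"BR"` will see the feature until someone calls `turn_off` for it.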

The disadvantages are:

  • There are cases in which it is impossible, or at least very difficult, to implement a feature toggle
    • Examples in the Java world
      • Static configuration files (e.g. web.xml, ejb-jar.xml and persistence.xml),
      • JPA mapping,
      • etc.
  • Every toggle requires an engineering process
    • First, avoid doing toggles for everything.
    • Second, you must be sure you need it.
    • Third, if you realize it is extremely necessary, it is important to provide a good design for its implementation.
  • Toggle On means a brand new technical debt
    • Once you enable your toggle and it is working 100% in production, you have a technical debt that should be removed as soon as possible.
    • If you do not remove the technical debt, your software will be harder to improve, maintain, test and deploy. This happens because your code must handle all possible situations (i.e. toggle On and Off).
  • Features losing priority
    • Sometimes people forget to enable a feature toggle just because that feature has lost its priority. Unfortunately, this happens very easily, at least in my experience.
    • Sometimes people find an issue in production and then disable the toggle, but never fix the issue. In this case, the toggle stays there, disabled forever.
  • Feature toggles can easily get out of control
    • Sometimes people lose track of what is enabled in which environment.
    • The number of toggles will grow faster than you think. Take care.
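One cheap way to keep track of what is enabled where is to compare the toggle states of each environment and list the ones that differ. The sketch below assumes each environment's toggles can be exported as a name-to-boolean mapping. The `environment_drift` function and the environment names in the usage note are hypothetical.

```python
from typing import Dict, Mapping


def environment_drift(
    states: Mapping[str, Mapping[str, bool]]
) -> Dict[str, Dict[str, bool]]:
    """Return the toggles whose value is not the same in every environment.

    `states` maps environment name -> {toggle name -> enabled}.
    A toggle missing from an environment is treated as Off there.
    """
    names = set()
    for toggles in states.values():
        names.update(toggles)

    drift: Dict[str, Dict[str, bool]] = {}
    for name in sorted(names):
        per_env = {env: toggles.get(name, False) for env, toggles in states.items()}
        if len(set(per_env.values())) > 1:
            drift[name] = per_env
    return drift
```

For example, calling it with a `staging` mapping where a toggle is On and a `production` mapping where that toggle is absent reports the toggle as drifting, with its value per environment.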

Finally, the discussion about this subject

Any technique has its advantages and disadvantages. However, as you may have noticed, most of the disadvantages reflect not giving feature toggles the control they need. With good toggle management, frequent releases and focus (do one thing at a time), it is possible to solve most of these issues.

To help overcome these difficulties, we have been recording the following information for every feature toggle:

  1. The Value
    1. Usually enabled/disabled, but it can be a complex data structure describing the region in which it is enabled, the user group, etc.
  2. A detailed description
    1. Detailed information about what this feature toggle does and why it is necessary
  3. The deadline (very important)
    1. When this toggle should die and how you should perform the clean up
  4. The person responsible (very important)
    1. The engineer/team responsible for monitoring and removing the toggle. Usually, this responsibility is given to whoever chose to add it.
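The four pieces of information above fit naturally into a small record. The following is a sketch only, not the tool we actually use: the `ToggleRecord` fields mirror the list, the value is simplified to a boolean, and the `overdue` helper is a hypothetical example of the kind of check that becomes possible once the deadline is recorded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class ToggleRecord:
    name: str
    value: bool        # simplified; could be a richer structure (region, group, ...)
    description: str   # what the toggle does and why it is necessary
    deadline: date     # when the toggle should die
    owner: str         # engineer/team responsible for removing it


def overdue(records: Iterable[ToggleRecord], today: date) -> List[ToggleRecord]:
    """Toggles whose deadline has already passed, oldest deadline first."""
    late = [record for record in records if record.deadline < today]
    return sorted(late, key=lambda record: record.deadline)
```

Running `overdue` on every build, or on a schedule, gives each owner a short list of toggles to clean up before the debt piles up.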

Besides that, we have developed a dashboard, reviewed periodically, that shows the information for all toggles so that we get the full benefit of this technique. Moreover, as Martin Fowler already said, “while feature toggles are a valuable tool in the box, they are a second-best option. The best thing to do with such features is to find a way to gradually release them into production as you are building them”.

Whether you liked it or not, please leave a comment. I will be very happy to hear about other teams' experiences.