Product Development at Velocity.
If you can’t continuously improve how you’re learning from customers and continuously improve the teams inside your organisation, you can’t ship continuously improved products.
The Principle of Continuity
I will use a very simple example to frame our discussion. Let’s say that your software development team believes that the customer wants an improved red button that’s square instead of round. Okay, development of an improved red button proceeds and is completed in one week. But, at the same time, the team creates an automated test for the red button as well as clear metrics for determining if the red button is actually the problem. This could be as simple as an A/B test (7 A/B Testing Resources for Startups and Solo Developers) that checks which button is used more.
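A minimal sketch of how such an A/B comparison might be evaluated, assuming we can count how many visitors saw each button and how many clicked it. The function name, the example counts and the choice of a two-proportion z-test are illustrative assumptions, not something the example above prescribes.

```python
import math


def compare_buttons(clicks_a, views_a, clicks_b, views_b):
    """Two-sided two-proportion z-test on click-through rates.

    Returns (rate_a, rate_b, z, p_value). All inputs are plain counts.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the assumption that both buttons perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        # No variation at all (e.g. nobody clicked either button).
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: variant A is the round button, B the square one.
rate_a, rate_b, z, p_value = compare_buttons(100, 1000, 150, 1000)
square_button_wins = rate_b > rate_a and p_value < 0.05
```

A small p-value only says the difference in use is unlikely to be noise; whether the button was "actually the problem" still depends on the metric the team agreed on beforehand.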
The idea that the button needed changing is an hypothesis until it has been tested with customers. If the processes are not currently in place for the development team to receive these metrics (many teams are kept in splendid isolation from the results of their work), then establishing this relationship, perhaps with the Web operations team, is also going to be required.
This is as much a requirement of successful deployment of these features as the features themselves. The focus on the delivery of product features is not more important than developing the team capability to understand the success of the features once deployed. Following directly from this is the insight that the capability of the team in general precedes the capability of the product being developed. Therefore, we need to ensure that our teams are improving continuously too.
Continuous improvement implies (a certain amount of) learning continuously. In agile development we use retrospectives and blameless post-mortems to reflect on the causes of failure or success and what we need to do to continuously improve.
This is an example of what Chris Argyris calls ‘double-loop learning’. He explains the concept with a simple example:
“A thermostat that automatically turns on the heat whenever the temperature in a room drops below 68°F is a good example of single-loop learning. A thermostat that could ask, ‘why am I set to 68°F?’ and then explore whether or not some other temperature might more economically achieve the goal of heating the room would be engaged in double-loop learning.”
Continuous improvement of teams fundamentally requires the creation of disciplines that support double-loop learning both internally and externally. To consider the entire software development process as a mechanism of learning is not putting it too strongly. We build something (small) so that we can see if it works, both technically and for our customers. To do this we have to always ensure that we are releasing and not creating dams of deployed but unreleased features. The main insight we receive from release is the learning that shapes what we build next. Taken together with our earlier discussion of value, this results in “The Principle of Continuity: We Build to release, we release to learn, we learn to build.” But what does this imply for how we build products?
The IOTA Model
From the preceding insights we are able to identify two tensions that frame our product development activities:
A tension between what we value and what the customer values.
A tension between how we think about the world (of the customer, our team etc.) and how we act in it.
These are shown in the model above as a matrix, identifying 4 stages in a cycle of product development.
Capability building → essentially a list of features / stories.
Condition improvement → a list of stories that do not impact the customer now but improve the team’s ability.
The Two Product Principle
I think of the product we ship to the customer as “Product 1”. This is all about realizing our hypotheses of what we think the customer wants, but it isn’t practical action until we have actually delivered it and received feedback from the customer. We find Product 1 in the Capability quadrant. Everything we build should be a capability we hypothesise will be valuable to the customer and our organisation. Meanwhile, the internal capabilities we wish to develop, which I call “Product 2”, are our view of what we need to improve while we build. These capabilities are, therefore, not theoretical in the same way as “Product 1.” We can see their effectiveness, and our effectiveness at building them, the very minute we decide to start improving them. “Product 2” takes hold, is at hand, immediately.
How often, though, do we release software features that were ‘urgent’ but whose effectiveness is never actually measured? Every feature should be measured.
What could we do with the results of our measures? These inform the next cycle (or sprint), where we either build on our success at accurately providing for the customer or decide whether to remediate our miscalculation.
Worked Example
A great example of this, familiar to software practitioners in banks everywhere, is the ‘budgeting application’ which many banks have built into their online offerings. The hypothesis was that customers needed help budgeting. This is almost certainly true, but in most cases the means by which it was tested was to provide a budgeting solution and then measure its use. This, incidentally, is very low across the board.
Measuring the ‘use’ of software is not the same as measuring value. If we do not have an hypothesis about the results this new feature will produce for our company, why are we investing resources in it in the first place? The IOTA model helps us join all of the pieces together. The example below shows what a quick two-week sprint might look like:
- Hypotheses:
- Customers want to budget using historical data from their bank account rather than control future spending (which is how most people understand the term ‘budget’).
- Customers understand how they spend money by categorizing their transactions.
- Capabilities (features):
- Create an alert that notifies the customer whenever spend in a category goes over a budgeted amount, e.g. $100 from Amazon.
- Enable the categorisation of transactions.
- Conditions (Product 2):
- Obtain from the data warehouse team any attributes they already use to analyse transactions so that we can provide categories for users as a starter set.
- Determine if we can deduce categories of things bought using the Amazon transaction reference.
- Targets:
- Within a month, 5% of internet banking customers have categorized some transactions.
- Greater than 75% of those customers have activated a budget alert.
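The sprint above can also be written down as data, which makes the targets mechanically checkable once the metrics arrive. This is only a sketch under assumptions: the class names, the quadrant names (hypotheses, capabilities, conditions, targets) and the observed figures are illustrative and not part of the IOTA model as published.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Target:
    name: str
    threshold: float       # fraction between 0 and 1
    strict: bool = False   # True means "greater than", False means "at least"

    def is_met(self, observed: float) -> bool:
        if self.strict:
            return observed > self.threshold
        return observed >= self.threshold


@dataclass
class IotaSprint:
    hypotheses: List[str]
    capabilities: List[str]   # Product 1 stories
    conditions: List[str]     # Product 2 stories
    targets: List[Target]

    def empty_quadrants(self) -> List[str]:
        """Names of quadrants that have no entries yet."""
        quadrants = {
            "hypotheses": self.hypotheses,
            "capabilities": self.capabilities,
            "conditions": self.conditions,
            "targets": self.targets,
        }
        return [name for name, items in quadrants.items() if not items]

    def review(self, observed: Dict[str, float]) -> Dict[str, bool]:
        """Map each target name to whether the observed fraction meets it.

        A target with no observation counts as not met.
        """
        return {
            t.name: t.name in observed and t.is_met(observed[t.name])
            for t in self.targets
        }


sprint = IotaSprint(
    hypotheses=["Customers understand spending by categorizing transactions"],
    capabilities=["Category budget alert", "Transaction categorisation"],
    conditions=["Get transaction attributes from the data warehouse team"],
    targets=[
        Target("categorised", 0.05),                   # 5% within a month
        Target("alert_activated", 0.75, strict=True),  # more than 75% of those
    ],
)

# Hypothetical observations at the end of the month.
results = sprint.review({"categorised": 0.20, "alert_activated": 0.10})
needs_rethink = not all(results.values())
```

With these made-up numbers the first target is met and the second is not, so the review sends the team back to the hypotheses rather than straight on to the next feature.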
Iterating the model
If the targets are not met it automatically results in a questioning of the entire cycle from the start (hypotheses) again. In the next sprint, we may decide to further refine our target because we got mixed results — for instance, 20% signed up but only 10% used alerts. Notice also the conditions (Product 2). These actions will not necessarily improve the team if the basic hypotheses are false, i.e. if customers really do not want to budget using their online banking service, but they may still be useful. The use of product 2 to investigate deeper and probe outside of the current scope of the hypotheses can be a useful means of not falling into the disruption trap.
To see why this is the case, consider that a bank wanted a budgeting application because they were afraid of disintermediation by apps like Mint.com. The response was to create some similar features as shown above. After the results above, the bank decided to test the ability to decline transactions by SMS based on the budget categories and implemented an MVP that enabled this only for Amazon purchases.
On release, the adoption of both categorization and alerts spiked sharply. For those in banking there are many variables in this but for the purposes of elaboration here, it is clear that the ability to control the transaction seems important to the value perceived by the customer. The basis for disintermediation being flashy graphical features (like budget pie charts) seems less likely than transaction control. In the next cycle we are forced to look at other transaction types or perhaps payment methods (e.g. PayPal) that are more transparent. In this way we use the customer to investigate value more openly. The basis for this will often be those conditions built up in Product 2 activities.
I have, however, also seen many Product 2 examples that are very mundane, like getting more powerful laptops for the team so they can run full local testing, or writing automated tests for dependent APIs. Either way, ensuring that Product 2 features are given the same priority as Product 1 features means that over time the team improves the conditions within which it works, as well as its understanding of the broader ecosystem within which the product functions. Product 2 features should be written like user stories, put on your scrum / kanban board and estimated / allocated in the same way as customer features.
The IOTA model can be tricky to get to grips with conceptually. This is why you shouldn’t try. Just start using it. I use a mechanism I call ‘The Crucible’ to enable this.
We discussed double-loop learning above, and there are many approaches to formalising it, including in very testing circumstances. A common one is the OODA loop used in the military: Observe, Orient, Decide, Act. Another, of course, is Deming’s PDSA: Plan, Do, Study, Act.
In order to work effectively and make good decisions, we know we need to reflect on results but we also need rest. The human brain’s ability to concentrate on a single item begins to decline after around 10 minutes. In addition, if we believe we have plenty of time to make a decision we also tend to spend more time talking abstractly, without a clear outcome in mind, which results in many context shifts as the conversation meanders. This is wonderfully covered by Goldratt in ‘Critical Chain’ as ‘Student Syndrome.’ To remedy this, I break up each iteration of the IOTA model into a 15 minute block made up of four strictly timed activities:
A. 2 minutes to decide what we are going to model this iteration.
B. 5 minutes to do one cycle of IOTA using stickies on a flipchart, ensuring we have added at least one sticky to each quadrant.
C. 5 minutes to reflect on ‘how’ we approached the problem: not the solution itself, but how we worked together to get to the result we have. Think of this as a small retrospective.
D. 3 minutes of rest. This is critical as it gives the brain time to process questions in the background using different cognitive pathways. This is why, after taking a moment, we often arrive at a solution.
In my experience, having run this hundreds of times, the best results are achieved after around 10 iterations like this. Each iteration is a refinement of the previous one and in a product development context enables a small team standing around a flipchart to become crystal clear on what they need to build to test an hypothesis and what success looks like.
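For anyone facilitating a session, the arithmetic of the time slices can be laid out as a simple timetable. This is a small sketch, assuming the four slice lengths given above and around 10 iterations; the function and variable names are made up for illustration.

```python
# Activity names paraphrase steps A to D; durations are in minutes.
SLICES = [
    ("decide what to model", 2),
    ("one IOTA cycle on the flipchart", 5),
    ("reflect on how we worked", 5),
    ("rest", 3),
]


def crucible_schedule(iterations=10):
    """Return (iteration, activity, start_minute, end_minute) tuples."""
    schedule = []
    clock = 0
    for iteration in range(1, iterations + 1):
        for activity, minutes in SLICES:
            schedule.append((iteration, activity, clock, clock + minutes))
            clock += minutes
    return schedule


schedule = crucible_schedule()
minutes_per_iteration = sum(minutes for _, minutes in SLICES)
total_minutes = schedule[-1][3]

for iteration, activity, start, end in schedule[:4]:
    print(f"iteration {iteration}: {start:>3}-{end:<3} {activity}")
```

Ten iterations of 15 minutes come to two and a half hours, which is worth knowing before booking the room.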
In many cases, teams use story cards for the Capabilities quadrant to do sprint planning for a short sprint of around two weeks. The use of short sprints no longer than two weeks is very powerful because it reduces the focus on illusory estimation. This approach is often referred to as the “No Estimates” movement and is well described by Vasco Duarte. I’m a big fan.
I use the resulting model as a mechanism for talking to stakeholders outside the project team and the beauty is that everybody in the team can do the same thing. It is also useful as a context setter for retrospectives at the end of the sprint and to remind ourselves of our priorities during the sprint.
The HSM — Architects, Vision and other Horizons
The IOTA model is also very scalable. A question often asked by agile teams in transition concerns the role of the architect and how to reconcile technical vision and strategy with product vision and strategy. If both of these are ‘well defined’, do we really have any choice as we iterate? The solution, of course, is for architects to describe the boundary of the technical solution, especially those elements that are not under your control. I particularly like Eric Evans’s approaches in Domain-Driven Design for this. Some discussion of design principles going forward can be useful, but the best design principles emerge out of real problems at the time of software construction and not in anticipation of them. There are of course some basic ideas that we might want to agree on. The 12 Factor approach is popular, for instance. But what of the product vision?
A product vision further out than around 3 months is likely either too abstract to be useful or too constraining to allow learning and experimentation. Remember that ‘product ideas’ are not the same as ‘backlog’. In enterprises, backlogs that are three months’ worth of work can very quickly become assumed commitments of exactly that work. I define a Holy S**T Moment, or High Stakes Message if you prefer, as that customer experience or EVENT (we will come back to this word) in using your software that truly surprises customers and changes their perception of what was possible. These are tough to deliver and always end up being subtly different from the brilliant idea you had in your bath…as you iterate and explore.
The IOTA model can be used to describe the HSM (as a capability) and provide a focus on the high-level hypotheses and tests on which the idea of the HSM is based. Perhaps we thought that budgeting would be an HSM for our customers, but it turns out that discovering how much money they can save by declining transactions was the HSM. Having a clear focus on an HSM, which you test after every sprint, is critical because you can very easily slip into the detail of sprint planning while making massive assumptions about the overall impact of the collected features on your customer. In the end, therefore, you review your sprint IOTA and your 3 month IOTA regularly. This approach works well together with Jeff Patton’s Story Mapping if you need the additional structure.
How could we use the model of value discussed earlier to improve how we build product? Both Jeff Patton and Marty Cagan have some great ideas for how to ask good questions of the products you are creating. More and more, however, I come back to forcing reflection on how the product features we deliver in each sprint cover the 5 bases above. From a product development point of view we may translate them thus:
• Velocity → Which of the customer’s jobs can now be done faster with the same quality?
• Granularity → What deep capability are you building, not all of which might be visible to the user, but which will enable you to unlock further features later? One of my favourite examples of this is the introduction of Python to configure the game Civilization IV.
• Density → What more complex jobs are possible taking all features already delivered into account?
• Concreteness → What can the customer now do that previously they could not?
• Agency → What choices do you give the customer in how to perform a task?
Using Cagan’s language — all of these can be ‘delighters’ and using this approach I think you are more likely to identify that delight. Good news. These questions also work for Product 2! For clarity, I phrase them slightly differently:
• Velocity → What capability does the team need in order to accomplish a recurrent task more quickly?
• Granularity → In which product or technology areas do you need to acquire greater knowledge?
• Density → Which tasks can be integrated and performed together (i.e. automated)?
• Concreteness → What will the whole team be able to do at the end of the sprint that they are not able to do now?
• Agency → What capacity do you have in the current sprint to experiment?
Products developed using IOTA and the Crucible create EVENTS in both customer and team members’ lives.