software development

Test Trade-Offs

TL;DR: Software developers often decide what tests to write based on technical considerations. Instead, they should decide what tests to write based on what feedback is missing. The Test Trade-Offs Model can be used to make better decisions about which tests to write. The most useful dimensions to weigh when deciding what type of tests to write are speed of feedback, coverage and variation.


The current thinking in the software development industry is to have a lot of low-level unit tests, fewer integration tests, and even fewer higher-level tests like system or end-to-end tests. The Test Pyramid, shown below, is a common model used to describe the relative amounts or ratios of the different types of tests we should aim for.

Traditional Test Pyramid

This kind of thinking generally focuses on how quickly the tests run – i.e. speed of feedback – and also how easy it is to write the different types of tests. Both of these are technical considerations. The problem I have with this thinking is that it ignores the primary reason we have tests in the first place – to get feedback about our system. If technical considerations govern the types of tests we have, there may be a large number of tests we will never write, and thus a lot of feedback we’re not getting. For example, having lots of low-level unit tests doesn’t give us any information about how the system works as a whole. Evidence of this phenomenon is the multitude of memes around unit testing not being enough. Some of my favourites:

Unit testers be like

Focusing on technical considerations only leads us to make blind trade-offs: we’re not even aware of other dimensions we should be considering when deciding which tests to write. The Test Trade-Offs Model was developed so that teams can make deliberate trade-offs when deciding which tests to write, by making the other trade-off dimensions explicit. The model is predicated on the idea that different tests are valuable to different audiences at different times, for different reasons.

The dimensions currently in the model are:

  • Speed: How quickly does the test execute? How long do we have to wait to get the feedback the test gives us?
  • Coverage: How much of the system (vertically) does the test exercise? In general, the higher the coverage, the more confident we are about the behaviour of the system as a whole, since more of the system is being exercised. Coverage is also known as scope or depth.
  • Variation: How many near-identical variations of the test are there? E.g. if a test has lots of inputs, there may be very many combinations of inputs, with each combination requiring its own test. (This article is useful for more on this idea.) A short code sketch illustrating this dimension follows this list.
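
To make the variation dimension concrete, here is a minimal C#/NUnit sketch. The Shipping class and its pricing rules are hypothetical, invented purely for illustration: two inputs with a few interesting values each already produce six near-identical test cases, and every additional input multiplies that number.

using NUnit.Framework;

[TestFixture]
public class ShippingCostTests
{
    // Hypothetical system under test: orders of 100 or more ship free;
    // express delivery adds a flat surcharge.
    public static class Shipping
    {
        public static int Cost(int orderTotal, bool isExpress) =>
            (orderTotal >= 100 ? 0 : 10) + (isExpress ? 15 : 0);
    }

    // Each combination of inputs is a near-identical variation of the same test.
    [TestCase(50, false, ExpectedResult = 10)]
    [TestCase(50, true, ExpectedResult = 25)]
    [TestCase(100, false, ExpectedResult = 0)]
    [TestCase(100, true, ExpectedResult = 15)]
    [TestCase(150, false, ExpectedResult = 0)]
    [TestCase(150, true, ExpectedResult = 15)]
    public int Calculates_shipping_cost(int orderTotal, bool isExpress) =>
        Shipping.Cost(orderTotal, isExpress);
}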

In an ideal world, our tests would execute instantaneously, cover the entire system, and exercise every combination of inputs and states as well. The ideal test would therefore score very highly in all dimensions. Unfortunately this is not possible in the real world, since some of the dimensions have an inverse effect on others. The image below is a causal loop diagram showing the causal relationships between dimensions.

Causal Loop Diagram
  • An increase in Coverage generally leads to a decrease in speed of feedback. This is because the more of the system covered by the test, the longer the test takes to run.
  • An increase in Variation typically leads to a decrease in coverage. With high variation, there is usually a very high number of tests. If the suite of tests is to complete running in a reasonable timeframe, we usually decrease the coverage of these tests.

As the model shows, no test can ever maximise all dimensions; any test will compromise on some of them. We therefore need to choose which dimension each test should prioritise – this is the trade-off. The choice of priorities should be based on what feedback about the system we need.

For example, if we need tests that give us information about the behaviour of the whole system, which will be valuable for a long time, we’re most likely willing to compromise on speed of execution and variation. The trade-off is now explicit and deliberate. Traditionally we would have ruled out such a test immediately because it would take too long to run.

The way I’d like to see the model being used is for teams to decide what system feedback they’re missing, decide what trade-offs to make, and then what kind of tests to write.

I believe this to be the first iteration of the model, and I expect it to evolve. I’m certain there are other dimensions I haven’t yet included, perhaps even more important ones. What dimensions do you use when deciding what type of tests to write? What dimensions do you think should be added to the model?

Acknowledgement
I would like to thank Louise Perold, Jacques de Vos and Cindy Carless, who helped me refine my thinking around this model and improve this article.

software development

The States, Interactions and Outcomes Model

TL;DR: The States, Interactions and Outcomes model provides a way for cross-functional teams to collaboratively explore, specify and document expected system behaviour.


Specification by Example (SBE) and Behaviour-Driven Development (BDD) can be incredibly effective ways for teams to explore and define their expectations of the behaviour of a system. The States, Interactions and Outcomes Model provides a set of steps and a lightweight documentation structure for teams to use SBE and BDD more effectively. The best way of conveying the model is through a worked example.

Worked example
To demonstrate the model and the process, I will take you through applying it to a problem I use frequently in coaching and training. Imagine we are creating software to calculate the total cost of purchased items at a point of sale. (This problem is inspired by Dave Thomas’ Supermarket Pricing Kata.) You walk up to a till at a supermarket, hand the checkout person your items one-by-one, and the checkout person starts calculating the total of the items you want to purchase. The total is updated each time the checkout person records an item for purchase.

We would like to include a number of different ways of calculating the total price for purchased items, since the supermarket will want to run promotions from time to time. Some of the pricing methods we would like to include are:

  • Simple Pricing: the total cost is calculated simply by adding up the cost of each individual item recorded at the point of sale.
  • Three-for-Two Promotion: Buy three of any particular item, pay for only two. This promotion is specific to the type of item being sold. For example, buy three loaves of Brand-X bread, pay for only two.
  • Combo Deal: A discount is applied when a specific combination of items is purchased.
  • Bulk Discount: A discount is applied when more than a specific number of a particular item is purchased.

In this article I will deal with only ‘Simple Pricing’ and ‘Three-for-Two Promotion’. I will deal first with ‘Simple Pricing’ completely, and then start with ‘Three-for-Two Promotion’.

Simple Pricing

  • System boundaries: We are concerned only with the way the total for the purchased items is calculated. We are not concerned with things like how the cost of an item is acquired (e.g. barcode scanning), accepting payment etc.
  • Types of inputs: For Simple Pricing, the only input is the price of the item being recorded – item price.
  • Types of state: What affects calculating the total price besides item price? For Simple Pricing, the total after recording an item – the new total – is determined by both the price of the captured item and the total before the item is captured. Therefore state consists of current total.
  • Outcome dimensions: For Simple Pricing, the outcome consists only of the total calculated as a result of capturing an item – new total.
  • Possible values for state types: Current total is an integer, which can be negative, 0, or positive.
  • Possible values for inputs: Item price is an integer, which can be negative, 0, or positive.

Expected outcomes for combinations of state and inputs:

| Current total (state) | Capture item that costs (interaction) | New total (outcome) | Error (outcome)                      | Scenario name                   |
|-----------------------|---------------------------------------|---------------------|--------------------------------------|---------------------------------|
| 0                     | 0                                     | 0                   |                                      | Free first item                 |
| 0                     | 10                                    | 10                  |                                      | First item                      |
| 10                    | 10                                    | 20                  |                                      | Second item                     |
| 0                     | -10                                   |                     | ERROR – item price can’t be negative | First item with negative price  |
| 10                    | -10                                   |                     | ERROR – item price can’t be negative | Second item with negative price |
| 10                    | ABCDEF                                |                     | ERROR – invalid input                | Text input                      |
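
These rows translate almost mechanically into executable checks. Below is a minimal C#/NUnit sketch; the Till class is hypothetical, with just enough implementation to make the rows executable, and the ‘Text input’ row is omitted since rejecting non-numeric input would sit at the system boundary (e.g. input parsing) rather than in the pricing logic.

using System;
using NUnit.Framework;

// Hypothetical implementation, just enough to make the specification executable.
public class Till
{
    public int Total { get; private set; }

    public Till(int currentTotal) => Total = currentTotal;    // state: current total

    public int CaptureItem(int itemPrice)                     // interaction
    {
        if (itemPrice < 0)
            throw new ArgumentException("item price can't be negative");
        return Total += itemPrice;                            // outcome: new total
    }
}

[TestFixture]
public class SimplePricingSpecs
{
    // One TestCase per row of the table: state, interaction, expected outcome.
    [TestCase(0, 0, ExpectedResult = 0, TestName = "Free first item")]
    [TestCase(0, 10, ExpectedResult = 10, TestName = "First item")]
    [TestCase(10, 10, ExpectedResult = 20, TestName = "Second item")]
    public int Capturing_an_item_yields_a_new_total(int currentTotal, int itemPrice) =>
        new Till(currentTotal).CaptureItem(itemPrice);

    [TestCase(0, -10, TestName = "First item with negative price")]
    [TestCase(10, -10, TestName = "Second item with negative price")]
    public void Negative_item_prices_are_rejected(int currentTotal, int itemPrice) =>
        Assert.Throws<ArgumentException>(() => new Till(currentTotal).CaptureItem(itemPrice));
}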

Three-for-Two Promotion

  • System boundaries: The system boundaries don’t change compared to Simple Pricing.
  • Types of inputs: For Three-for-Two Promotion the type or name of the item is now also required as an input – item type.
  • Types of state: The outcome is now also affected by two other types of state: the types of items already captured – already captured items; and the type of Promotion currently active – Active Promotion.
  • Outcome dimensions: For Three-for-Two Promotion, the outcome consists of new total, as well as the new list of items that have been captured – new captured items.
  • Possible values for state types: Current total is an integer, which can be negative, 0, or positive. Already captured items specifies the quantity and type of each item already captured. Active Promotion is a complex type: it can be ‘none’ or a promotion for a specific type of item, e.g. ‘Buy 3 Cokes, pay for 2’.
  • Possible values for inputs: Item price is an integer, which can be negative, 0, or positive. Item type names the kind of item being captured, e.g. ‘Coke’ or ‘bread’.

Expected outcomes for combinations of state and inputs:

| Active promotion (state) | Current total (state) | Items already captured (state) | Capture (interaction) | That costs (interaction) | New total (outcome) | New captured items (outcome) | Error (outcome) | Scenario name                              |
|--------------------------|-----------------------|--------------------------------|-----------------------|--------------------------|---------------------|------------------------------|-----------------|--------------------------------------------|
| (none)                   | 20                    | 2 Cokes                        | Coke                  | 10                       | 30                  | 3 Cokes                      |                 | 3rd item with no promotion                 |
| Buy 3 Cokes, pay for 2   | 20                    | 2 Cokes                        | Coke                  | 10                       | 20                  | 3 Cokes                      |                 | 3rd qualifying item with 3-for-2 promotion |
| Buy 3 Cokes, pay for 2   | 20                    | 1 Coke, 1 bread                | Coke                  | 10                       | 30                  | 2 Cokes, 1 bread             |                 | 3rd item doesn’t trigger promotion         |
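
The promotion rows can be transcribed in the same way. Here is a hedged sketch extending the hypothetical Till idea with just enough promotion logic to execute the second and third rows; none of these type or member names come from a real codebase.

using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

// Hypothetical till with item types and a single three-for-two promotion.
public class PromoTill
{
    private readonly string promoItemType;      // state: active 'buy 3, pay for 2' item type
    private readonly List<string> captured;     // state: items already captured
    public int Total { get; private set; }      // state: current total
    public IReadOnlyList<string> CapturedItems => captured;

    public PromoTill(int currentTotal, IEnumerable<string> alreadyCaptured, string threeForTwoItemType)
    {
        Total = currentTotal;
        captured = alreadyCaptured.ToList();
        promoItemType = threeForTwoItemType;
    }

    public void CaptureItem(string itemType, int price)
    {
        captured.Add(itemType);
        // Under the promotion, every third qualifying item is free.
        bool free = itemType == promoItemType
                    && captured.Count(i => i == itemType) % 3 == 0;
        if (!free)
            Total += price;
    }
}

[TestFixture]
public class ThreeForTwoPromotionSpecs
{
    [Test]
    public void Third_qualifying_item_with_3_for_2_promotion()
    {
        var till = new PromoTill(20, new[] { "Coke", "Coke" }, "Coke");
        till.CaptureItem("Coke", 10);
        Assert.That(till.Total, Is.EqualTo(20));    // the third Coke is free
        Assert.That(till.CapturedItems, Is.EqualTo(new[] { "Coke", "Coke", "Coke" }));
    }

    [Test]
    public void Third_item_doesnt_trigger_promotion()
    {
        var till = new PromoTill(20, new[] { "Coke", "bread" }, "Coke");
        till.CaptureItem("Coke", 10);
        Assert.That(till.Total, Is.EqualTo(30));    // only two Cokes so far: full price
    }
}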

There are several interesting things about the specifications above to which I’d like to draw particular attention:

  • All the words and concepts used are domain-level words and concepts. There are no implementation or software-specific words.
  • The specification describes the transactions and outcomes only, not how the work should be done.
  • The things that determine the outcome of a transaction are super-obvious and explicit. This makes it easier to detect and discuss edge cases.
  • Invalid states and interactions are easy to see.
  • The path to any particular state is clear and obvious.
  • Should we want to, it would be easy to automate the verification of a system which should satisfy these specifications.

As mentioned above, I developed and use this model during my coaching and training. It has proven very effective for quickly exploring and documenting system behaviour. In some BDD Bootcamps, we have explored and specified legacy systems running in production in about 3 hours. One of the ways this has proven useful is that people in the bootcamp who had not worked on those particular systems gained a very thorough high-level overview of the intention of the system.

The worked example above follows these steps:
1. Explicitly define and bound the system under specification. What is included, what is excluded?
2. What are the different inputs to the system?
3. What are the types of state that the system can have? Another way to ask this: Besides the inputs, what can affect the outcome of an interaction?
4. What constitutes a system outcome? Is any output returned to the user? Note that an outcome must, by definition, include the resulting values of all the types of state identified above. Outcomes can also include error conditions.
5. For each type of state, what are the possible values?
6. For each type of input, what are the possible values?
7. For each combination of state and interaction, what is the expected outcome (including all dimensions)?

The Thinking Behind The Model
The idea behind the model is that the outcome of a system interaction is a function of the interaction and the state of the system at the time of interaction. We can develop a complete and comprehensive specification of expected system behaviour by describing the expected outcome for every possible combination of state and interaction.
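
In code terms this amounts to a single pure function; a minimal sketch (the type and member names here are illustrative, not taken from any library):

// The outcome of an interaction is a function of the interaction and the
// state of the system at the time of the interaction: outcome = f(state, interaction).
public interface ISpecifiableSystem<TState, TInteraction, TOutcome>
{
    TOutcome Handle(TState currentState, TInteraction interaction);
}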

Specification by Example and Behaviour-Driven Development
The model and the steps are largely based on the concepts of Specification by Example and Behaviour-Driven Development. Specification by Example (SBE) is the practice of specifying expected system behaviour using concrete values instead of natural-language descriptions. For more on Specification by Example, you can’t do better than Gojko Adzic’s book. Behaviour-Driven Development (BDD) uses SBE. One of the reasons I use SBE is that it allows us to work with something tangible, instead of ‘invisible ideas’. Some of the benefits of using BDD and SBE are:

  • Getting feedback on the work from a wider audience earlier in the process.
  • Making edge cases more obvious.

Ordinarily, we would need to write some software to achieve these things. By using BDD and SBE we can get these benefits before writing any software. However it is not always easy to get started with these techniques.

A common challenge teams face when they start using BDD and SBE is the need to make every aspect of expected externally-observable system behaviour completely explicit. That is, all the factors which affect the behaviour of the system must be identified and made explicit. If any of these factors are missing or unknown, we cannot specify expected system behaviour completely and comprehensively – we will have gaps. It is difficult to develop a successful software product if there are gaps or inconsistencies in what we expect the software to do.

Understanding systems
The steps above are designed to help a team understand the system they’re dealing with. The simplest way to understand the behaviour of a system is as a transaction: some entity is stimulated or exercised in a particular way, and the entity does some work. The simplest way of modelling a transaction is by stating that the input to a system determines the output.

input-system-output

In this view, the system output is determined only by the input to the system. I have come to use the terms ‘Interaction’ and ‘Outcome’ instead of ‘input’ and ‘output’ respectively, because they are closer to the way most people think about working with software products: “I interact with a system to achieve some outcome”.

interaction-system-outcome

However, it is important to understand that the outcome of an interaction with a system is determined not only by the interaction, but also by the state of the system at the time of the interaction.

state-interaction-system-outcome

The introduction of state into the picture often causes some challenges. The first challenge is differentiating between interaction and state. The easiest way to distinguish between them is by asking: ‘What determines the outcome of an interaction, besides the input?’

The next challenge is understanding that system state is generally not described by a single value. System state is typically made up of multiple dimensions or types, and therefore must be expressed as a set of concrete values, one value per dimension. The same applies to values supplied to the system as part of an interaction.

Once a team begins thinking in terms of states, interactions and outcomes, they’re generally able to have more effective conversations around what behaviour they expect from their system.

Facilitation, Open-Space

Lean Coffee: Lessons Learned Hosting & Facilitating

TL;DR: Lessons learned from hosting and facilitating Lean Coffees at public meetups, within companies and at conferences.


I have been hosting and facilitating Lean Coffee for a number of years now, for the Lean Coffee JHB Meetup, at companies where I’ve worked, and at several conferences I’ve attended. Along the way I’ve learned a few things about hosting and facilitating. Quite frequently people ask me for advice on starting a Lean Coffee. These are the lessons I’ve learned.

Lean Coffee
Lean Coffee is a lightweight structure for an informal gathering where the participants decide the agenda at the start of the gathering in a just-in-time way. The aim is to have many shallow discussions about a broad range of topics instead of deeply discussing only one or two topics. All you need for a Lean Coffee is somewhere to gather (hopefully with either good coffee or beer), someone to invite people, and someone to facilitate (after a while the gatherings become self-facilitating).
The discussion around each topic is time-boxed, so that we don’t spend too much time on a single topic. Lean Coffee was originally developed to discuss Agile/Lean coaching, but you can discuss anything. You as a host decide if there are boundaries to the topics which can be discussed.

Hosting
It’s very easy to host a Lean Coffee; all you need to do is:

  • Secure a space for the event. This may or may not need to be booked. For public events, we usually just show up at a coffee shop, sometimes without warning them. The venue should be appropriate for the number of people you’re expecting (see Group Size below).
  • Invite people. Get the word out there. Most public Lean Coffee groups I know of use meetup.com which works well. For conferences, getting Lean Coffee added to the official conference program works well, as does Twitter, and getting conference hosts to announce it.
  • Secure stationery for the gathering. This might entail bringing it yourself or asking other people (or a sponsor) to provide it. Generally all you need is a few pads of sticky notes, and some pens and/or markers. If at a conference, there’s likely to be some stationery around that you can use. You may want a whiteboard or flipchart too depending on type of venue and number of people (see Facilitating below).
  • Make sure there’s a supply of good coffee and/or beer. It’s OK for people to pay for their own coffee at a public Lean Coffee (sponsorship is always great though). Try to make sure there’s decent stuff.

Facilitating
Facilitating means you’re holding the space and managing the flow of the event. You can think of it like directing the traffic at the gathering. Facilitating typically includes the following:

  • Dealing with any rudeness, conflict, hostility, high emotions etc. if they occur. We’re allowed to disagree, but we’re not allowed to disrespect each other.
  • Creating a flow board (otherwise known as a Kanban board or Personal Kanban board), with ‘To Do’, ‘Doing’ and ‘Done’ columns. The board is used to keep track of where we currently are in the event.
  • Explaining the flow of Lean Coffee (see Flow below). This includes when you will retrospect and end. I have a set of laminated cards I use on the flow board to demonstrate the flow.
  • Moving the cards and topic sticky notes on the flow board as the event progresses.
  • Mentioning the Sticky Note Tips (see below) before the brainstorm.
  • Reading out the proposed topics one by one, in any order you like, prompting the proposer to pitch their idea.
  • Explaining the dot-voting procedure when it’s time to vote for topics.
  • Reading out the new topic again at the start of each new discussion.
  • Keeping time and indicating when it’s time for the silent thumb vote (see Flow below).
  • Deciding whether to keep the current topic or move on to a new one based on the number of votes.
  • Moving on to a new topic if discussion dies down before the timebox has ended.
  • Making sure that once a new topic has started, people stop discussing the old topic and move on to the new one.
  • Stopping timeously to retrospect, and running the retrospective.

Flow
The flow of Lean Coffee is:

  1. Explain: Explain what Lean Coffee is, especially for first-timers. The Lean Coffee section above should be useful for this. Use the points below to describe the flow.
  2. Introduce: Everyone gives a 20-second, 2-sentence introduction of themselves. The content of the introduction should be germane to the theme of the Lean Coffee, especially if it is bounded. You might want to skip this step if there are a lot of people at the gathering.
  3. Brainstorm: Each person silently writes a brief summary of each topic they’d like to discuss, one topic per sticky note (see Sticky Note Tips below). A topic can be anything, usually a question the proposer has, or something they’ve been thinking about. Unless there’s a theme, I tell people they can propose anything, since the group will vote on what they want to discuss anyway.
  4. Pitch: The facilitator goes through the proposed topics one at a time, sticky by sticky, in whatever order they like, and reads out the topic summary. The topic proposer then gives a quick spoken pitch of the topic. The pitch should be just long and deep enough to allow everyone to vote on which topics they want to discuss.
  5. Vote: Everyone votes on which topics they’d like to discuss. Voting is done with dots (i.e. dot voting). A vote is cast by making a dot on a sticky with a pen or marker. Everyone gets a certain number of dots for voting, typically 2 or 3. The number is determined by the number of participants.
    A person can distribute their votes in any way: all votes on a single sticky, or one each on as many stickies as there are votes. Topic proposers can vote for their own topics. Once voting has finished, the facilitator orders the stickies from most to fewest votes, and places them in the ‘To Do’ column of the flow board. The facilitator decides how to handle stickies with the same number of votes.
  6. Discuss: This is the meat of the event. The group starts discussing the topics, starting with the topic with the most votes. The topic proposer starts the discussion, usually with the summary, and maybe a little more depth. The discussion continues until a time limit is reached. The time limit is usually decided by the facilitator or group at the beginning of the event. I usually use 5 minutes. I have also heard of 8 minutes and 15 minutes.
    Once the 5 minutes are up, a silent thumb vote is held, where each person signals whether they think discussing the current topic is valuable, or a new topic should be chosen. Intent is shown by a thumbs-up or thumbs-down gesture. People may abstain by using a horizontal thumb. It is important that everyone understands what thumbs-up and thumbs-down means, since different Lean Coffee groups have their own conventions. My convention is: thumbs-up = I’m still interested, same topic; thumbs-down = I want a new topic. The idea of voting with thumbs is that it is silent and doesn’t intrude on the discussion.
    If there are enough ‘same topic’ votes, the discussion for this topic is extended, but is again timeboxed. My ‘extension’ timebox convention is 3 minutes. I’ve also heard of 5 minutes. I’ve also heard of not allowing extensions. My convention is to allow up to 2 extensions (with another silent thumb vote after the first extension). If a discussion is really animated and entertaining, with shouting and fist fights, I may decide to allow a discussion to continue as long as I want. You can decide on how many extensions you want to have. Once the discussion of a particular topic has ended, whether through ‘all extensions used’, enough ‘new topic’ votes, or if the discussion ends before the end of the timebox, the next topic is discussed. If you run out of time or topics, then you start the retrospective.
  7. Retro: How was the gathering? What can be improved next time it’s run so that it will be a better, more valuable, more rewarding experience? A lot of the content of this article has come from retrospectives.

Sticky Note Tips
There are a few things to remember which improve the usage and utility of sticky notes. They seem simple, but they’re easy to forget. I’ve made all of these mistakes at one time or another.

  • Write on the non-sticky side.
  • Write with the sticky strip on the top of the sticky.
  • Write only one idea per sticky.
  • Don’t peel stickies off the pad; pull or shear them off. They stick better this way. Peeling leads to curling.
  • Write large enough so that the writing can be seen from a reasonable distance.
  • Try to write with a colour that can be seen on the sticky colour.

Group Size
Group size is probably the factor that affects the hosting and facilitating of Lean Coffee the most.

  • Fewer than 4 people: When there are so few people, I usually discard the Lean Coffee structure and have a regular conversation. At this size venue doesn’t really matter.
  • Up to about 8 people: The public Lean Coffees I attend are usually around this size. Everyone can fit around a single table or a few small tables pushed together. Sound doesn’t need to travel as far. For these reasons venue is not too important at this size. The flow board can be created on the table(s) around which everyone sits.
  • Up to 20-25 people: This size is quite typical for conference and internal Lean Coffees. At this size it becomes difficult to fit everyone at a single table, and for everyone to see the flow board if it is on a table. Therefore you’ll probably need either a large room like a boardroom with a large table, or a room with several tables. At this size you want a room with doors that can close, especially if the venue is adjacent to a noisy area, like the registration or refreshments area at a conference, because everyone needs to be audible from anywhere in the room.
    At this size it’s not feasible to have the flow board on a table, so I usually use a convenient wall or whiteboard for the flow board. At conferences I usually use a flipchart for the flow board, and a second flipchart to hold the topic stickies before, during and after voting.
  • More than 20-25 people: I don’t have much experience with this group size. A pattern I’ve heard about a few times is splitting the large group into two or more smaller groups, which have their own discussions then ‘reconvene’ afterwards and brief each other on their discussions and learnings.

Internal Lean Coffee
For internal (private, company-specific) Lean Coffees, it’s a good idea to announce at the beginning that everything said in the event is private and confidential and shouldn’t leave the room. This creates a safe space so that people can open up more.
At one company we tried a few invitation mechanisms:

  • Single invitations for recurring meetings scheduled through Outlook/Exchange. Easy to create and administer, but declined once and forgotten about forever.
  • A ‘private’ group on meetup.com. This didn’t really gain any traction.

And, eventually,

  • A single invitation email per event, sent about a week before the time. This was a bit more admin, but I solved that by copying and pasting the email content and mailing list. This option worked the best. Be warned that people might get tired of your ‘spam’ 🙂

At this same company, people saw it as something for people involved in software development only, and thus didn’t attend. It turned out the name ‘Lean Coffee’ was the main reason behind this. I changed the name to ‘Coffee & Conversation’, and in my emails explicitly mentioned that it was for everyone. This increased attendance and diversity.

For internal Lean Coffee, try to get your company or department to sponsor some refreshments: coffee, donuts, sandwiches etc. This is a relatively cheap but effective crowd puller.

Try to book a meeting/board room that is very visible and has a lot of foot traffic, so that people see you as they walk past. Generate interest and curiosity.

One thing you should be aware of is that the presence of managers (or people to whom other people report) will have an effect on how safe people feel, and therefore on what they’ll be prepared to talk about. I’ve also seen the presence of senior managers change the conversation into questions that should be part of ordinary operations (this is fine if it’s the aim of your Lean Coffee, but it wasn’t the aim of this one).

Resources
* Lean Coffee
* A nice slideshow I found explaining Lean Coffee
* Gerry Kirk’s One Page Intro which I’ve used a lot in the past. I usually have one or two laminated print-outs of this sitting on tables when I facilitate.
* Another great article on Lean Coffee
* Jo’burg Lean Coffee Meetup Group: My ‘home’ Lean Coffee, and where I first learned about it.
* Modus Cooperandi’s Lean Coffee material: Modus Cooperandi is Jim Benson’s company; Jim Benson co-created Lean Coffee.

Please offer your Lean Coffee thoughts, tricks, tips, hacks and resources in the comments!

software development

Issues I’ve seen with Jira (and other similar tools)

TL;DR: Jira (and similar tools) is great for keeping track of bugs, and for tracking, coordinating and auditing work. It is a terrible way to communicate and collaborate across a team. Don’t use it as a communication tool. Rather, speak in person and then document your shared understanding in a tool like Jira, if such documentation is required.


There are some common anti-patterns I’ve seen in organisations that use Jira (and other similar tools). This is what I’ve experienced in several environments: a team goes into Backlog Refinement and/or Sprint Planning and the Product Owner (or whoever is responsible for documenting requirements) projects Jira onto the screen in the boardroom. One by one, each work-item is opened and the PO reads through what he or she has written. There’s a little bit of discussion, the delivery team decides they’re ready to play planning poker, they do so and provide an estimate for the work-item, which is captured, and then they move on to the next work-item.

Instead of having a conversation about what problem we’re trying to solve and what outcome we want to achieve, what often happens is everyone just reads what’s been written down. Yes, there is some discussion and a few questions, but fundamentally the conversation is framed (and limited) by what the PO had written down previously (however long ago that was). “We don’t need a conversation because all you need to know is in Jira, just read it.”

Because there is a lot of content, and because the PO clearly took a lot of time and effort to write all of this content, there is a tendency to avoid asking difficult questions. “It must be the correct solution, right? The PO must have thought of everything, right?” How likely is the PO to be open to suggestions around solving the problem in a different way? (This is true for everyone, not just POs. Programmers are notorious for being attached to previous work.)

The opportunities for and likelihood of achieving shared understanding in this scenario are severely diminished. “There is so much written down, ‘communication’ must have taken place, right? We all understand what we’re trying to achieve, right?” The likelihood of coming up with a good solution to the problem is also severely diminished.

What typically happens is during development of the feature, there will be multiple ad-hoc face-to-face conversations between the delivery team and the PO (assuming the PO is available to the delivery team), when it is discovered that the written content in Jira is deficient or defective in some way.

What would happen if, instead of the PO creating all the requirements documentation and then passing it to the delivery team, the parties had a real conversation, communicated effectively and gained shared understanding? If it is necessary to document this shared understanding, it can then be stored in a tool like Jira. This mitigates a lot of the pitfalls outlined above. I do also think tools like Jira can be very useful for attaching artifacts to work items, like screenshots, designs, wireframes etc.

Another big problem with Jira and similar tools is tool enslavement. Instead of making the tool work for us, we begin to work for the tool. I’ve seen much time and effort wasted on configuring and troubleshooting things like Jira workflows. I’ve seen things like Team A’s Sprint being closed when Team B’s Sprint was closed, simply because the Sprints happened to have the same name (based on the date). Guess how much fun that was to resolve.

In conclusion, Jira was originally created as a bug/work-tracking tool, and its best use is as a tool to plan, coordinate and audit work, not as a communication medium. If you need to document shared understanding, or what work was done when and by whom, experiment with capturing this information after the fact, instead of up-front.

Uncategorized

Finding my Tribes and Leveling Up

TL;DR: There are many channels available for leveling-up and for finding your tribe, some of them less ‘traditional’. These are some that I use:
– In-person gatherings (meetups, conferences etc)
– Twitter
– Instant messaging (Slack/HipChat/Jabbr/Gitter/IRC etc)
– Websites (MSDN/Stack Overflow etc)
– Podcasts


This post was prompted by several things: My interview for Source; a conversation I had with two developers I am coaching, who had been thrown into the deep-end on a project with new, unfamiliar technology, and very little support from their company; a conversation with an acquaintance about resources available for leveling up.

A number of years ago, I felt very lonely and misunderstood as a professional developer. I care deeply about my craft and about self-development and self-improvement, but I struggled to find people with a similar outlook and experience. Part of my frustration was not having anyone with whom to discuss things and soundboard ideas.

I’m glad to say that today my life is totally different. I belong to several tribes, both meatspace and virtual, and have access to a lot more people and resources, with lots of different experiences and points of view. In fact I could probably spend my days only in debate and discussion now, without doing any other work. Besides the communities and resources discussed below, I’m extremely fortunate to be working at nReality, where I have amazing colleagues, as well as access to a broad range of people through my training and coaching work.

The resources I use the most these days to level up are
– Meatspace Events
– Twitter
– Slack

Meatspace events are great for many reasons, including: you learn a lot during the talks, and you get to meet awesome, like-minded people and have stimulating conversations. There are a number of great in-person events. The best place to find them is meetup.com.

Of particular significance is DeveloperUG (Twitter) which has monthly events in Jo’burg and Pretoria/Centurion. I owe a massive debt to the founder and past organisers of DeveloperUG, Mark Pearl, Robert MacLean and Terence Kruger for creating such an amazing community.

I am involved in running or hosting these meatspace communities:
Jo’burg Lean Coffee
Jo’burg Domain Driven Design

These meatspace communities are also valuable:
Scrum User Group Jo’burg
Jozi.rb
Jozi JUG
Code & Coffee
Code Retreat SA
Jo’burg Software Testers

Conferences that I’ve attended and spoken at include
Agile Africa
Scrum Gathering
DevConf, and
Let’s Test SA

I haven’t yet had the opportunity to attend others, like JSinSA, RubyFuza and ScaleConf, but I know the same applies to them as well. Pro-tip: Getting accepted to speak or present a poster at a conference usually gets you in free; sometimes the conference also pays for your travel and accommodation costs.

As important as meatspace events and communities are, virtual communities provide access to people in other locations. My current way of connecting is finding people and conversations on Twitter, then using Slack to have deeper, ‘better’ conversations. I do have good conversations via Twitter, but it’s a bit clumsy for a few reasons, and Slack often works better for real conversations.

Twitter and Slack are great for connecting with people for a number of reasons:
– public & discoverable
– low ceremony
– no strings attached

This means that it’s very easy to start a conversation with anyone, and they’re more likely to respond since it’s easy to do so (low ceremony), and they’re not making any kind of commitment (no strings attached).

I’ve been lucky enough to have conversations with some of my idols, like Kent Beck, Uncle Bob, Woody Zuill, Tim Ottinger etc., some on Twitter, some on Slack, some on both.

I belong to these open-to-the-public Slack communities:
– ZADevelopers – South African Developers (invitation on request)
Legacy Code Rocks – All things Legacy Code (technical debt, TDD, refactoring etc)
ddd-cqrs-es – Domain-Driven Design, CQRS, Event Sourcing
Software Craftsmanship – Software Craftsmanship
– Coaching Circles – Coaching, especially Agile, Lean etc (invitation on request)
WeDoTDD – Test-Driven Development
Testing Community – Testing (I joined very recently)

What resources do you use to level up and connect to communities of interest? Let me know in the comments!

software development

How I do TDD

TL;DR: The specifications I write are domain-level, Ubiquitous Language-based specifications of system behaviour. These spec tests describe the functional behaviour of the system, as it would be specified by a Product Owner (PO), and as it would be experienced by the user; i.e. user-level functional specifications.


I get asked frequently how I do Test-Driven Development (TDD), especially with example code. In my previous post I discussed reasons why I do TDD. In this post I’ll discuss how I do TDD.

First off, in my opinion, there is no difference between TDD, Behaviour-Driven Development (BDD) and Acceptance-Test-Driven Development (ATDD). The fundamental concept is identical: write a failing test, then write the implementation that makes that test pass. The only difference is the ‘amount’ of the system each term typically describes: BDD specifies at a user level, ATDD at a user-acceptance level (like BDD), and TDD at the level of much finer-grained components or parts of the system. I see very little value in these distinctions: they’re all just TDD to me. (I do find it valuable to discuss the ROI of TDD specifications at different levels, as well as their cost and feedback speed. I’m currently working on a post discussing these ideas.) Like many practices and techniques, I see TDD as fractal. We get the biggest ROI from a spec test that covers as much of the system as possible.

The test that covers the most of the system is a user-level functional test. That is, a test written at the level of a user’s interaction with the system – i.e. outside of the system – which describes the functional behaviour of the system. The ‘user-level’ part is important, since this level is by definition outside of the system, and covers the entirety of the system being specified.

It’s time for an example. The first example I’ll use is specifying Conway’s Game of Life (GoL).

These images represent an (incomplete) specification of GoL, and can be used to specify and validate any implementation of Conway’s Game of Life, from the user’s (i.e. a functional) point of view. These specifications make complete sense to a PO specifying GoL – in fact, this is often how GoL is presented.

These images translate to the following code:

[Test]
public void Test_Barge()
{
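    // A barge is a 'still life' pattern: after a tick the grid must be unchanged.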
    var initialGrid = new char[,]
    {
        {'.', '.', '.', '.', '.', '.'},
        {'.', '.', '*', '.', '.', '.'},
        {'.', '*', '.', '*', '.', '.'},
        {'.', '.', '*', '.', '*', '.'},
        {'.', '.', '.', '*', '.', '.'},
        {'.', '.', '.', '.', '.', '.'},
    };

    Game.PrintGrid(initialGrid);

    var game = CreateGame(initialGrid);
    game.Tick();
    char[,] generation = game.Grid;

    Assert.That(generation, Is.EqualTo(initialGrid));
}

[Test]
public void Test_Blinker()
{
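    // A blinker oscillates with period 2: the horizontal bar becomes vertical,
    // then horizontal again.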
    var initialGrid = new char[,]
    {
        {'.', '.', '.'},
        {'*', '*', '*'},
        {'.', '.', '.'},
    };

    var expectedGeneration1 = new char[,]
    {
        {'.', '*', '.'},
        {'.', '*', '.'},
        {'.', '*', '.'},
    };

    var expectedGeneration2 = new char[,]
    {
        {'.', '.', '.'},
        {'*', '*', '*'},
        {'.', '.', '.'},
    };

    var game = CreateGame(initialGrid);
    Game.PrintGrid(initialGrid);
    game.Tick();

    char[,] actualGeneration = game.Grid;
    Assert.That(actualGeneration, Is.EqualTo(expectedGeneration1));

    game.Tick();
    actualGeneration = game.Grid;
    Assert.That(actualGeneration, Is.EqualTo(expectedGeneration2));
}

[Test]
public void Test_Glider()
{
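    // A glider translates itself across the grid; note that the expected grids
    // also grow from 3x3 to 4x3 as the glider moves.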
    var initialGrid = new char[,]
    {
        {'.', '*', '.'},
        {'.', '.', '*'},
        {'*', '*', '*'},
    };

    var expectedGeneration1 = new char[,]
    {
        {'.', '.', '.'},
        {'*', '.', '*'},
        {'.', '*', '*'},
        {'.', '*', '.'},
    };

    var expectedGeneration2 = new char[,]
    {
        {'.', '.', '.'},
        {'.', '.', '*'},
        {'*', '.', '*'},
        {'.', '*', '*'},
    };

    var game = CreateGame(initialGrid);
    Game.PrintGrid(initialGrid);

    game.Tick();
    char[,] actualGeneration = game.Grid;

    Assert.That(actualGeneration, Is.EqualTo(expectedGeneration1));

    game.Tick();
    actualGeneration = game.Grid;
    Assert.That(actualGeneration, Is.EqualTo(expectedGeneration2));
}

All I’ve done to make the visual specification above executable is transcode it into a programming language – C# in this case. Note that the tests do not influence or mandate anything of the implementation, besides the data structures used for input and output.

I’d like to introduce another example, this one a little more complicated and real-world. I’m currently developing a book discovery and lending app, called Lend2Me, the source code for which is available on Github. The app is built using Event Sourcing (and thus Domain-Driven Design and Command-Query Responsibility Segregation). The only tests in this codebase are user-level functional tests. Because I’m using DDD, the spec tests are written in the Ubiquitous Language of the domain, and actually describe the domain, and not a system implementation. Since user interaction with the system is in the form of Command-Result and Query-Result pairs, these are what are used to specify the system. Spec tests for a Command generally take the following form:

GIVEN:  a sequence of previously handled Commands
WHEN:   a particular Command is issued
THEN:   a particular result is returned

In addition to this basic functionality, I also want to capture the domain events I expect to be persisted (to the event store), which is still a domain-level concern, and of interest to the PO. Including event persistence, the tests, in general, look like:

GIVEN:  a sequence of previously handled Commands
WHEN:   a particular Command is issued
THEN:   a particular result is returned
And:    a particular sequence of events is persisted

A concrete example from Lend2Me:

User Story: AddBookToLibrary – As a User I want to Add Books to my Library so that my Connections can see what Books I own.
Scenario: AddingPreviouslyRemovedBookToLibraryShouldSucceed

GIVEN:  Joshua is a Registered User AND Joshua has Added and Removed Oliver Twist from his Library
WHEN:   Joshua Adds Oliver Twist to his Library
THEN:   Oliver Twist is Added to Joshua's Library

This spec test is written at the domain level, in the Ubiquitous Language. The code for this test is:

[Test]
public void AddingPreviouslyRemovedBookToLibraryShouldSucceed()
{
    RegisterUser joshuaRegisters = new RegisterUser(processId, user1Id, 1, "Joshua Lewis", "Email address");
    AddBookToLibrary joshuaAddsOliverTwistToLibrary = new AddBookToLibrary(processId, user1Id, user1Id, title, author, isbnnumber);
    RemoveBookFromLibrary joshuaRemovesOliverTwistFromLibrary = new RemoveBookFromLibrary(processId, user1Id, user1Id, title, author, isbnnumber);

    UserRegistered joshuaRegistered = new UserRegistered(processId, user1Id, 1, joshuaRegisters.UserName, joshuaRegisters.PrimaryEmail);
    BookAddedToLibrary oliverTwistAddedToJoshuasLibrary = new BookAddedToLibrary(processId, user1Id, title, author, isbnnumber);
    BookRemovedFromLibrary oliverTwistRemovedFromJoshuaLibrary = new BookRemovedFromLibrary(processId, user1Id, title, author, isbnnumber);

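    // Replay the prior Commands as history, issue the Command under test,
    // then assert on both the returned result and the persisted event stream.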
    Given(joshuaRegisters, joshuaAddsOliverTwistToLibrary, joshuaRemovesOliverTwistFromLibrary);
    When(joshuaAddsOliverTwistToLibrary);
    Then(succeed);
    AndEventsSavedForAggregate(user1Id, joshuaRegistered, oliverTwistAddedToJoshuasLibrary, oliverTwistRemovedFromJoshuaLibrary, oliverTwistAddedToJoshuasLibrary);
}

The definitions of the Commands in this code (e.g. RegisterUser) would be specified as part of the domain:

public class RegisterUser : AuthenticatedCommand
{
    public long AuthUserId { get; set; }
    public string UserName { get; set; }
    public string PrimaryEmail { get; set; }

    public RegisterUser(Guid processId, Guid newUserId, long authUserId, string userName, string primaryEmail)
    : base(processId, newUserId, newUserId)
    {
        AuthUserId = authUserId;
        UserName = userName;
        PrimaryEmail = primaryEmail;
    }

}

Again, the executable test is just a transcoding of the Ubiquitous Language spec test into C#. Note how closely the code follows the Ubiquitous Language used in the natural-language specification. Again, the test does not make any assumptions about the implementation of the system. It would be very easy, for example, to change the test so that instead of calling CommandHandlers and QueryHandlers directly, HTTP requests are issued.

It could be argued that the design of this system makes it very easy to write user-level functional spec tests. However, these patterns can be used to specify any system, since it is possible to automatically interact with any system through the same interface a user uses. Similarly, it is possible to automatically assert any state in any boundary-interfacing system, e.g. databases, 3rd-party web services, message buses etc. It may not be easy or cheap or robust, but it is possible. For example, a spec test could be:

GIVEN:  Pre-conditions
WHEN:   This HTTP request is issued to the system
THEN:   Return this response
AND:    This HTTP request was made to this endpoint
AND:    This data is in the database
AND:    This message was put onto this message bus

This is how I do TDD: the spec tests are user-level functional tests. If it’s difficult, expensive or brittle to use the same interface the user uses, e.g. GUIs, then test as close to the system boundary, i.e. as close to the user, as you can.

For interest’s sake, all the tests in the Lend2Me codebase hit a live PostgreSQL database, and an embedded eventstore client. There is no mocking anywhere. All interactions and assertions are outside the system.

software development

Why I do TDD

People give many reasons for why they do Test-Driven Development (TDD). The benefits I get from TDD, and thus my reasons for practicing it, are given below, ordered from most important to least important. The order is based on whether the benefit is available without test-first, and the ease with which the benefit is gained without a test-first approach.

1. Executable specification

The first step in building the correct thing is knowing what the correct thing is (or what it does). I find it very difficult to be confident that I’m building the correct thing without TDD. If we can specify the expectations of our system so unambiguously that a computer can understand and execute that specification, there’s no room for ambiguity or misinterpretation of what we expect from the system. TDD tests are executable specifications. This is pretty much impossible without test-first.

I also believe Specification by Example, which is necessary for TDD, is a great way to get ‘Product Owners’ to think about what they want (without having to first build software).

2. Living documentation

The problem with most other forms of documentation is that there is no guarantee they are current. It’s easy to change the code without changing the natural-language documentation, including comments; there’s no way to force the documentation to stay current. However, with TDD tests, as long as the tests are passing, and assuming the tests are good and valid, the tests act as perfectly up-to-date documentation for and of the code (from an intent and expectation point of view, not necessarily from an implementation PoV). The documentation is thus considered ‘living’. This is possible with a test-after approach but is harder than with test-first.

3. Focus

TDD done properly, i.e. true Red-Green-Refactor, forces me to focus on one area of work at a time, and not to worry about the future. The future may be as soon as the next test scenario or case, but I don’t worry about that. The system gets built one failing test at a time. I don’t care about the other tests; I’ll get there in good time. This is possible without test-first, but it is much harder – nigh impossible for me.

4. As a design tool

This is the usual primary benefit touted for TDD. My understanding of this thinking is thus: in general, well-designed systems are decoupled. Decoupling means it’s possible to introduce a seam for testing components in isolation. Therefore, in general, well-designed systems are testable. Since TDD forces me to design components that are testable, I’m inherently forced to design decoupled components, which is in general considered a ‘better design’. This is low down on my list because this is relatively easy to do without a test-first approach, and even without tests at all. I’m pretty confident I can build what I (and I think others) consider to be well-designed systems without going test-first.

5. Suite of automated regression tests

Another popular selling point for TDD is that the tests provide a suite of automated regression tests. This is absolutely true. However I consider this a very pleasant by-product of doing TDD. A full suite of automated regression tests is very possible with a test-after approach. Another concern I have with this selling point is that TDD is not about testing, and touting regression tests as a benefit muddies this point.

Why do you do TDD? Do you disagree with my reasons or my order? Let me know in the comments!