Start at the End · Matt Wallaert · 2019
You should know what human behaviour you’re trying to change before you do any meaningful product development work: it creates clarity and gives the team a leading, tractable metric to pursue. This book outlines a simple and seemingly bulletproof approach to behaviour change.
- Nearly everything we create is designed to shape behaviour, yet we rarely acknowledge this; start with a clear behavioural goal for your creation.
- Behaviour is influenced by promoting pressures (motivations making behaviour more likely) and inhibiting pressures (factors making behaviour less likely); designing interventions involves altering these pressures.
- The Intervention Design Process (IDP) involves:
  - Observing and validating a behavioural insight (a gap between current and desired behaviours).
  - Crafting a behavioural statement to define the desired outcome.
  - Mapping pressures influencing the behaviour.
  - Designing interventions to modify pressures.
  - Evaluating interventions through pilots, tests, and scaling.
- A behavioural statement outlines the desired behaviour change: When [population] wants to [motivation], and they [limitations], they will [behaviour] (as measured by [data]) — filled in with an example in the first sketch after this list.
- Pressure mapping identifies promoting and inhibiting pressures affecting behaviour, serving as levers for intervention design.
- Generate numerous intervention ideas without constraints, then select promising ones based on effectiveness and feasibility.
- Conduct an ethical check before piloting interventions, focusing on:
  - What behaviour you're changing.
  - How you're changing it.
  - Ensuring alignment with the population's motivations and values.
- Differentiate between the intention-action gap (people intend but fail to act) and the intention-goal gap (people have the goal but reject the behaviour as a way to achieve it).
- Ethical interventions align with existing motivations and do not impose costs that outweigh benefits or conflict with other motivations.
- Transparency and responsibility are crucial; openly communicate your intentions and be accountable for the outcomes of your interventions.
- Pilots are small-scale, operationally "dirty" interventions expected not to work, aimed at minimising resources and impact while testing for behaviour change.
- Pilot validation uses qualitative and quantitative data to assess whether an intervention shows promise, accepting higher p-values because sample sizes are small (see the worked example after this list).
- Tests are larger-scale interventions with greater operational diligence, assessing whether scaling is worthwhile based on effect size and operational cost.
- Scaling decisions are summarised as: We are [confidence] that [intervention] will [direction] [behaviour] (as measured by [data]). Scaling this requires [effort] and will result in [change]. (This template is also filled in after this list.)
- Documenting failures and continuous measurement are essential to avoid repeating mistakes and to ensure interventions remain effective over time.
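To make the two templates concrete, here is a minimal sketch that fills in both the behavioural statement and the scaling statement. The gym scenario, metric, effort, and effect size are invented for illustration, not taken from the book.

```python
# Illustrative only: all specifics below are invented examples.
behavioural_statement = (
    "When {population} want to {motivation}, and they {limitations}, "
    "they will {behaviour} (as measured by {data})."
).format(
    population="new gym members",
    motivation="get fitter",
    limitations="have unpredictable weekday schedules",
    behaviour="attend at least two classes per week",
    data="class check-in records",
)

scaling_statement = (
    "We are {confidence} that {intervention} will {direction} {behaviour} "
    "(as measured by {data}). Scaling this requires {effort} and will "
    "result in {change}."
).format(
    confidence="fairly confident",
    intervention="same-day class reminders",
    direction="increase",
    behaviour="weekly class attendance",
    data="class check-in records",
    effort="two engineering sprints",
    change="an estimated 10% lift in attendance",
)

print(behavioural_statement)
print(scaling_statement)
```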
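On accepting higher p-values in pilots, a small worked example (with invented counts) shows why: with only 25 people per arm, even a 20-point lift in the target behaviour reaches roughly p ≈ 0.14 under a two-proportion z-test, so a pilot is judged on whether the effect looks promising rather than on clearing p < 0.05.

```python
import math

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for a two-proportion z-test."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical pilot: 25 people per arm, 44% vs 24% doing the target behaviour.
p = two_proportion_p_value(11, 25, 6, 25)
print(f"p = {p:.2f}")  # ~0.14: above 0.05, yet promising enough to graduate to a test
```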
Quick Links
Embracing Uncertainty: A Modern Take On Strategy, Goals, and Roadmaps · Article
Consumer is Back – And Why It's Been So Hard Since 2014 · Article
Explore vs Execute · Article
Don't let machines or the crowd decide your world · Article
How to think about prompting · Image
Teaching CS50 with AI · Paper
The Rationale Behind the Empathy Gradient · Article
A refactoring guide for product managers · Article
Utility is in the Eye of the User: A Critique of NLP Leaderboards
Kawin Ethayarajh, Dan Jurafsky. 2020. (View Paper →)
Benchmarks such as GLUE have helped drive advances in NLP by incentivising the creation of more accurate models. While this leaderboard paradigm has been remarkably successful, a historical focus on performance-based evaluation has been at the expense of other qualities that the NLP community values in models, such as compactness, fairness, and energy efficiency. In this opinion paper, we study the divergence between what is incentivised by leaderboards and what is useful in practice through the lens of microeconomic theory. We frame both the leaderboard and NLP practitioners as consumers and the benefit they get from a model as its utility to them. With this framing, we formalise how leaderboards – in their current form – can be poor proxies for the NLP community at large. For example, a highly inefficient model would provide less utility to practitioners but not to a leaderboard, since it is a cost that only the former must bear. To allow practitioners to better estimate a model’s utility to them, we advocate for more transparency on leaderboards, such as the reporting of statistics that are of practical concern (e.g., model size, energy efficiency, and inference latency).
DeepMind and other companies are moving away from using leaderboards. What gets measured gets managed, and leaderboards create strange incentives. They’re great for a ground truth problem like protein folding (CASP) but they aren’t nuanced enough to capture all of the non-goals and important ethical considerations.
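To illustrate the paper's framing, here is a rough sketch in which a leaderboard and a practitioner assign different utilities to the same models. The models, budgets, and penalty weights are invented for illustration, not from the paper.

```python
# Hypothetical models and weights, for illustration only.
models = {
    "big_accurate":    {"accuracy": 0.92, "latency_ms": 400, "size_gb": 11.0},
    "small_efficient": {"accuracy": 0.89, "latency_ms": 20,  "size_gb": 0.4},
}

def leaderboard_utility(m):
    # The leaderboard only "pays" for accuracy; inefficiency is a cost it never bears.
    return m["accuracy"]

def practitioner_utility(m, latency_budget_ms=50, size_budget_gb=1.0):
    # A practitioner bears latency and memory costs, so they discount the score.
    penalty = (
        max(0.0, m["latency_ms"] / latency_budget_ms - 1) * 0.05
        + max(0.0, m["size_gb"] / size_budget_gb - 1) * 0.01
    )
    return m["accuracy"] - penalty

for name, m in models.items():
    print(name, round(leaderboard_utility(m), 3), round(practitioner_utility(m), 3))
# The leaderboard ranks the big model first; the practitioner ranks the small one first.
```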
Book Highlights
Next, through quantitative research, a company can determine which of those jobs are most important and least satisfied and will make the most attractive markets to target for growth. Anthony W. Ulwick · Jobs to Be Done
Resilient systems build in fail-safes so that when something breaks down, the next step to recover is obvious. Make your habit a resilient system. Michael Bungay Stanier · The Coaching Habit
But this kind of technology takes you out of your life. It interrupts you, often with information you don’t need, because it is delivered by default. Amber Case · Calm Technology
Algorithms that employ usage data are called collaborative filtering. Algorithms that use content metadata and user profiles to calculate recommendations are called content-based filtering. A mix of the two types is called hybrid recommenders. Kim Falk · Practical Recommender Systems
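A toy sketch of that distinction (the interaction matrix, the metadata, and the 50/50 blend are invented): collaborative filtering derives item similarity from who consumed what, content-based filtering derives it from item metadata, and a hybrid blends the two signals.

```python
import numpy as np

# Hypothetical data, for illustration only.
# usage: rows = users, columns = items; 1 means the user consumed the item.
usage = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
# item_features: rows = items, columns = content metadata (e.g. genre flags).
item_features = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# Collaborative filtering: items are similar if the same users consumed them.
cf_sim = cosine(usage[:, 0], usage[:, 1])
# Content-based filtering: items are similar if their metadata is similar.
cb_sim = cosine(item_features[0], item_features[1])
# A hybrid recommender blends the two similarity signals.
hybrid_sim = 0.5 * cf_sim + 0.5 * cb_sim
print(cf_sim, cb_sim, hybrid_sim)
```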
Quotes & Tweets
80% of the time you’re wrong about what a customer wants. Avinash Kaushik
One accurate measurement is worth more than a thousand expert opinions. Grace Hopper