How do you prevent metrics from being gamed or creating perverse incentives?
Why This Is Asked
Interviewers want to see that you understand Goodhart's Law—when a measure becomes a target, it ceases to be a good measure. They're looking for evidence that you design metrics thoughtfully, use multiple indicators, and create a culture where honest improvement matters more than hitting numbers.
Key Points to Cover
- Using balanced scorecards and multiple metrics to avoid single-target gaming
- Focusing on outcome metrics over activity metrics where possible
- Encouraging transparency and psychological safety so people report honestly
- Reviewing metrics design and adjusting when gaming emerges
STAR Method Answer Template
- Situation: Describe the context - what was happening, what team/company, what was at stake
- Task: What was your specific responsibility or challenge?
- Action: What specific steps did you take? Be detailed about YOUR actions.
- Result: What was the outcome? Use metrics where possible. What did you learn?
💡 Tips
- Give an example of a metric that was gamed and how you addressed it
- Show you treat metrics as signals, not goals—the real goal is better outcomes
✍️ Example Response
Situation: At a previous company, we measured "story points completed per sprint." Teams started inflating point estimates - a 2-point task became 5 points. Velocity went up, but delivery didn't. We had created a perverse incentive.
Task: I was asked to redesign our metrics to drive real improvement without gaming.
Action: I moved to a balanced scorecard: cycle time (hard to game—it's measured from commit to production), deployment frequency, change failure rate, and customer-reported bugs. I avoided single-metric targets—we looked at trends across all four. I also added qualitative inputs: blameless post-mortems, team health surveys, and stakeholder NPS. I made it explicit in team norms that we valued honest reporting over hitting numbers—and I modeled that by celebrating when someone surfaced a problem early. When I noticed a team optimizing for deployment frequency by shipping trivial changes, I raised it in a retrospective and we adjusted: we started tracking "meaningful deployments" (features or fixes with customer impact).
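The balanced scorecard described above can be sketched in code. This is a minimal illustration, not the actual system from the answer: the record format, field names, and the `scorecard` function are all hypothetical assumptions chosen to show how cycle time, deployment frequency, and change failure rate combine into one view.

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment records: (commit_time, deploy_time, caused_failure)
deployments = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 2, 9), False),
    (datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 21), False),
    (datetime(2024, 1, 5, 9), datetime(2024, 1, 7, 9), True),
]

def scorecard(records, window_days=7):
    """Compute a small balanced scorecard from deployment records.

    Returns median cycle time (commit to production, in hours),
    deployment frequency (deploys per day over the window), and
    change failure rate (share of deploys that caused a failure).
    """
    cycle_hours = [
        (deploy - commit).total_seconds() / 3600
        for commit, deploy, _ in records
    ]
    return {
        "median_cycle_hours": median(cycle_hours),
        "deploys_per_day": len(records) / window_days,
        "change_failure_rate": sum(1 for *_, failed in records if failed) / len(records),
    }

print(scorecard(deployments))
```

The point of returning all three numbers together is the one made in the answer: no single figure is a target on its own, so inflating one (say, deployment frequency via trivial changes) shows up as a divergence in the others.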
Result: Gaming dropped significantly. Cycle time became our primary improvement driver, and we saw a 35% reduction in time-to-production. I learned that Goodhart's Law is real—when a measure becomes a target, it ceases to be useful. Multiple indicators and a culture of honesty matter more than any single metric.
🏢 Companies Known to Ask This
| Company | Variation / Focus |
|---|---|
| Amazon | Earn Trust, Insist on Highest Standards — "How do you prevent metrics gaming?" |
| Google | Integrity, data-driven without gaming |
| Meta | Impact at scale, honest measurement |
| Microsoft | Growth mindset, customer focus |
| Netflix | Candor, high performance without gaming |
| Stripe | Technical judgment, moving fast with integrity |