AI product discovery: a practical framework for validating use cases
Most failed AI products didn't fail at training. They failed at discovery — built before anyone confirmed the user problem was real, the technology could actually solve it, or the unit economics worked. This guide is the discovery process I use with founders and product teams to decide what's worth building before a single line of model code is written.
Why AI discovery is different
Classic product discovery answers two questions: is this valuable? and is this usable? AI products add two more that you can't defer: is it actually feasible at production quality? and do the unit economics survive scale? Skip either and you ship a demo, not a product.
The lens I use comes from a decade of moving between research, engineering, and product. The framework below treats AI as one tool inside a workflow — not the product itself.
The five-phase framework
1. Frame the user problem first
Before anything model-shaped, write the problem in plain language: who has it, how often, what they do today, and what 'better' looks like. If you can't describe the workflow without saying 'AI', the use case isn't ready.
2. Validate user value
Run 5–8 problem interviews with the exact persona. Ask about the last time the problem happened, not hypotheticals. Score each use case on frequency, friction, and willingness to change tools.
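One way to keep interview scoring consistent is to record the three dimensions per interview and average them. The sketch below assumes a 1-5 scale and reports the weakest dimension alongside the averages; the scale, the `InterviewScore` class, and `summarize_use_case` are illustrative choices, not part of the framework itself.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class InterviewScore:
    """One interview's ratings for a use case, each on an assumed 1-5 scale."""
    frequency: int    # how often the problem occurs
    friction: int     # how painful today's workaround is
    willingness: int  # how open the person is to changing tools


def summarize_use_case(scores: list[InterviewScore]) -> dict[str, float]:
    """Average each dimension across interviews and report the weakest one."""
    if not scores:
        raise ValueError("need at least one interview score")
    summary = {
        "frequency": mean(s.frequency for s in scores),
        "friction": mean(s.friction for s in scores),
        "willingness": mean(s.willingness for s in scores),
    }
    # The weakest average is the dimension most likely to sink the use case.
    summary["weakest"] = min(summary.values())
    return summary
```

Surfacing the weakest dimension rather than a single blended total makes it harder for a high-frequency problem to hide the fact that nobody wants to switch tools.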
3. Pressure-test technical feasibility
Build a thin spike with the cheapest possible model and a representative dataset. Measure precision/recall (or task success rate) on real examples — not curated demos. Decide what 'good enough to ship' looks like before you keep building.
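For a spike with yes/no outputs, the measurements can be computed by hand from real examples. This is a minimal sketch, assuming each example has a boolean model prediction and a boolean human label; the function names are hypothetical.

```python
def precision_recall(predictions: list[bool], labels: list[bool]) -> tuple[float, float]:
    """Return (precision, recall) for boolean predictions against boolean labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    pairs = list(zip(predictions, labels))
    true_pos = sum(1 for p, y in pairs if p and y)
    false_pos = sum(1 for p, y in pairs if p and not y)
    false_neg = sum(1 for p, y in pairs if not p and y)
    # Guard against division by zero when there are no positives at all.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


def task_success_rate(outcomes: list[bool]) -> float:
    """Fraction of tasks judged successful; 0.0 for an empty list."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0
```

Write the thresholds you will compare these numbers against before running the spike, so the result cannot quietly redefine "good enough".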
4. Check business fit
Estimate per-task cost (tokens, inference, retrieval), the price the user would pay, and the failure cost of a wrong answer. If unit economics only work at scale, write down what scale and how you'll get there.
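The estimate above can be written as a few lines of arithmetic. The sketch below assumes token pricing quoted per million tokens and treats the failure cost as an expected value (failure rate times cost per failure); every parameter is a placeholder to fill in with your own figures, not a real price.

```python
def per_task_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
    retrieval_cost: float = 0.0,
) -> float:
    """Cost of one task: token charges plus any per-task retrieval cost."""
    token_cost = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    return token_cost + retrieval_cost


def expected_margin(
    price: float, cost: float, failure_rate: float, failure_cost: float
) -> float:
    """Per-task margin after subtracting the expected cost of wrong answers."""
    return price - cost - failure_rate * failure_cost
```

A negative `expected_margin` at realistic inputs means the use case only works if price, volume, or failure rate changes, and which of those you are betting on is worth writing down explicitly.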
5. Decide: build, partner, or kill
Score the use case across the four lenses from the earlier phases: problem framing, user value, technical feasibility, and business fit. Kill candidates that fail value or feasibility; they don't get better by being built. Greenlight only the ones with a clear shippable v1 in 6–8 weeks.
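The decision rule can be expressed as a small gate. This is a sketch under stated assumptions: the pass/fail inputs are your own judgments from the earlier phases, the 8-week ceiling is the upper end of the 6–8 week window, and the "undecided" outcome is a placeholder added here for cases that are neither killed nor greenlit (including possible partner candidates, which the rule does not define).

```python
def decide(
    value_ok: bool,
    feasibility_ok: bool,
    business_ok: bool,
    v1_weeks: int,
    max_v1_weeks: int = 8,
) -> str:
    """Return 'kill', 'build', or 'undecided' for one use case."""
    # Failing value or feasibility is terminal: building will not fix it.
    if not (value_ok and feasibility_ok):
        return "kill"
    # Greenlight only with working economics and a v1 that fits the window.
    if business_ok and v1_weeks <= max_v1_weeks:
        return "build"
    # Hypothetical catch-all: needs rescoping, a partner, or more evidence.
    return "undecided"
```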
A discovery checklist you can run this week
- One sentence describing the user, the trigger, and the current workaround.
- Five real examples of the task, with what "good" looks like for each.
- A measurable quality bar — what success rate ships, and what's a hard fail.
- Per-task cost estimate at expected volume, with a worst-case multiplier.
- A failure mode plan: what happens when the model is wrong, and who notices.
- A six-to-eight week v1 scope you'd defend to an investor.
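For the cost item in the checklist, a worst-case multiplier can be applied to the expected monthly spend. The sketch below is illustrative only; the function name and the idea of a single scalar multiplier (covering retries, longer prompts, or price changes) are assumptions.

```python
def monthly_spend(
    per_task_cost: float, tasks_per_month: int, worst_case_multiplier: float
) -> tuple[float, float]:
    """Return (expected, worst_case) monthly spend at the given volume."""
    if worst_case_multiplier < 1:
        raise ValueError("worst_case_multiplier should be at least 1")
    expected = per_task_cost * tasks_per_month
    return expected, expected * worst_case_multiplier
```

If the worst-case figure would end the project, that is a discovery finding in its own right.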
Anti-patterns to watch for
Starting from the model
‘We have GPT — what should we do with it?’ usually produces demos, not products. Start from the user workflow and ask whether AI removes a real step.
Demoing on cherry-picked inputs
A discovery prototype that only works on five hand-picked examples is telling you it doesn't work. Measure on a real distribution.
Skipping the cost model
Inference costs compound. A use case that's magical at $0.02 a call and ruinous at $2 a call is effectively two different products. Model the cost during discovery, not at launch.
Treating accuracy as the only quality bar
Latency, predictability, and graceful failure often matter more than another point of accuracy. Define the quality bar from the user's job, not the leaderboard.
Where this fits in the broader product mindset
AI product discovery isn't a separate discipline — it's product discovery with sharper teeth. The teams that ship AI products users actually rely on are the ones that treat models as one ingredient inside a workflow, measure quality from the user's job rather than the leaderboard, and kill candidates early instead of building their way out of a weak premise.
If you're staring at a list of possible AI use cases and not sure which one to commit to, let's talk. Discovery is exactly the part I help with.