
How to validate a SaaS idea without surveys

Surveys lie. Customer interviews are theater. Here's the watch-what-they-do framework that actually predicts willingness to pay.

Customer surveys are a measurement tool that distorts what it measures. People say they'd buy your thing. People say they'd pay $50/mo for it. Then you ship and nobody buys.

This isn't because users are dishonest. It's because asking someone about a hypothetical purchase activates a different brain than actually paying. The signal you collect from "would you use this?" bears almost no relationship to "did you reach for your wallet?"

Real validation has to look at what people already do, not what they say they would do. Specifically: what they pay for now, what they complain about loudly, what they stitch together themselves out of duct tape because nothing on the market fits.

That's what the rest of this post is about. Five behavioral signal sources, what each is good for, and how to combine them into a validation verdict before you write a line of code.

Why surveys fail (and interviews aren't much better)

Survey-grade signal:
  • Would you use a tool that does X? (Yes/No)
  • How much would you pay for it?
  • Rate the importance of feature X
  • Email signups on a fake landing page

Behavioral signal:
  • What tools are they paying for that overlap with X?
  • What duct-tape workflows are they posting screenshots of?
  • Where are they complaining loudly about the alternative?
  • What public commits / videos / threads show them doing X manually?

Customer interviews are slightly better than surveys because you can probe behavior. But they're still vulnerable to the social-pressure problem: interviewees want to be helpful, so they manufacture interest. The Mom Test (Rob Fitzpatrick's book) is the right reading on how to interview without inducing this — but the easier route is to just look at behavior people are already producing for free in public.

The five behavioral signal sources

1. Long-form video where people show their actual workflow

The densest source, and a candid one. Staging a 60-minute screen recording of a workflow you don't actually use isn't worth the production cost, so people don't bother. When you find 10–15 creators showing a workflow that involves manual steps you could automate, that's a strong signal: the manual steps are the duct tape, and removing them is your product.

Watch for:

  • Phrases like "this is a bit hacky but…" or "I do this manually because…" — explicit pain admissions
  • Multiple creators converging on the same intermediate format (CSV, spreadsheet, Notion table) — they've independently discovered the same data shape
  • The creator switching tools mid-workflow ("I export from X and import into Y") — friction at the boundary is your opportunity

"I download the transcript from YouTube, paste it into Claude with a custom prompt, copy the output back into Notion." — three applications, two copy-paste boundaries, all manual. That's a SaaS in disguise.

This is a frequent pattern across the creator workflows we've analyzed.
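
If you're processing transcripts at any volume, a crude keyword pass is enough to triage which videos deserve a full watch. A minimal sketch in Python (the phrase list is illustrative, not exhaustive):

```python
# Sketch: flag explicit pain admissions in a video transcript.
# The patterns below are illustrative; tune them to your candidate space.
import re

PAIN_PATTERNS = [
    r"bit hacky",
    r"\bI do this manually\b",
    r"export from .+ (and )?import into",
    r"copy[- ]paste",
    r"\bworkaround\b",
]

def pain_admissions(transcript: str) -> list[str]:
    """Return the sentences that contain a pain-admission phrase."""
    sentences = re.split(r"(?<=[.!?])\s+", transcript)
    return [
        s for s in sentences
        if any(re.search(p, s, re.IGNORECASE) for p in PAIN_PATTERNS)
    ]

demo = ("This is a bit hacky but it works. I export from YouTube and "
        "import into Notion. Then I tidy the table.")
print(pain_admissions(demo))
# ['This is a bit hacky but it works.', 'I export from YouTube and import into Notion.']
```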

2. GitHub issues, PRs, and forks of adjacent open-source tools

Open source is the most honest source of behavioral signal there is. People file issues against the tools they use because they're annoyed. They submit PRs to fix specific things. They fork repositories to add features they need but the maintainer won't accept.

Mining technique: pick the 3–5 closest open-source projects to your candidate space. Read their Issues with the most thumbs-up reactions and their list of forks sorted by commits ahead. Both surface real demand (a rough API sketch follows the list):

  • High-thumbed issues are the things lots of users want and the maintainer hasn't shipped — your potential first feature.
  • Forks with many commits ahead of upstream are people who needed something so badly they branched. Read their commits — they're your lead-user list.
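
Both lookups are scriptable. A rough sketch against the public GitHub REST API; the endpoints and sort parameters are real, the repos you point it at are yours, and it assumes forks kept the upstream's default branch name:

```python
# Sketch: mine an upstream repo for high-demand issues and active forks.
# Uses the public GitHub REST API; set GITHUB_TOKEN to avoid rate limits.
import os
import requests

API = "https://api.github.com"
HEADERS = {"Accept": "application/vnd.github+json"}
if os.environ.get("GITHUB_TOKEN"):
    HEADERS["Authorization"] = f"Bearer {os.environ['GITHUB_TOKEN']}"

def top_issues(owner, repo, n=10):
    """Open issues ranked by thumbs-up reactions (search API sort)."""
    r = requests.get(f"{API}/search/issues", headers=HEADERS, params={
        "q": f"repo:{owner}/{repo} is:issue is:open",
        "sort": "reactions-+1", "order": "desc", "per_page": n,
    })
    r.raise_for_status()
    return [(i["reactions"]["+1"], i["title"]) for i in r.json()["items"]]

def busiest_forks(owner, repo, n=10):
    """Forks ranked by commits ahead of upstream's default branch."""
    base = requests.get(f"{API}/repos/{owner}/{repo}",
                        headers=HEADERS).json()["default_branch"]
    forks = requests.get(f"{API}/repos/{owner}/{repo}/forks", headers=HEADERS,
                         params={"sort": "stargazers", "per_page": 30}).json()
    ahead = []
    for f in forks:
        cmp = requests.get(
            f"{API}/repos/{owner}/{repo}/compare/{base}...{f['owner']['login']}:{base}",
            headers=HEADERS)
        if cmp.ok:  # comparison fails if the fork renamed its branch
            ahead.append((cmp.json()["ahead_by"], f["full_name"]))
    return sorted(ahead, reverse=True)[:n]
```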

3. App store / Product Hunt reviews older than 30 days

Initial-launch reviews on PH are noise — they're mostly the creator's network being supportive. The signal is in reviews that accumulate 30+ days after launch, on tools that are still in use.

Specifically: the 3-star reviews. 5-star is fan service, 1-star is often unrelated frustration. The 3-star reviews are users who like the product enough to keep using it but are honest about its gaps. The gaps in 3-star reviews of a successful adjacent product are your wedge.
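
Once the reviews are scraped into a list, the filter is two conditions. A sketch, assuming each review dict carries a rating and a post date (the field names are hypothetical):

```python
# Sketch: keep only the wedge reviews: 3 stars, posted 30+ days after launch.
from datetime import datetime, timedelta

def wedge_reviews(reviews, launch: datetime):
    cutoff = launch + timedelta(days=30)
    return [r for r in reviews if r["rating"] == 3 and r["posted_at"] >= cutoff]

reviews = [
    {"rating": 5, "posted_at": datetime(2024, 1, 2), "text": "Love it! (launch week)"},
    {"rating": 3, "posted_at": datetime(2024, 3, 9), "text": "Good, but no bulk export."},
]
print(wedge_reviews(reviews, launch=datetime(2024, 1, 1)))
# Only the March 3-star review survives the filter.
```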

4. Reddit threads with 100+ comments on niche subs

The volume threshold matters. A 20-comment thread is one person's hot take. A 200-comment thread on r/saas or r/datascience or r/devops is a community converging on something real. The disagreement in the long threads is the signal — when 100 people are arguing about whether tool X or Y is better, both camps are actively in pain about the choice. That's pay-worthy intensity.
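
A sketch using PRAW, the Python Reddit API wrapper (you'll need your own API credentials; the subreddit and query below are placeholders). The `sort="comments"` search order plus a `num_comments` floor implements the volume threshold:

```python
# Sketch: surface high-volume threads on a niche sub with PRAW.
# Credentials are placeholders; create an app at reddit.com/prefs/apps.
import praw

reddit = praw.Reddit(client_id="YOUR_ID", client_secret="YOUR_SECRET",
                     user_agent="validation-research-sketch")

def loud_threads(sub: str, query: str, min_comments: int = 100):
    """Threads matching `query` with at least `min_comments` comments."""
    return [
        (s.num_comments, s.title, s.permalink)
        for s in reddit.subreddit(sub).search(query, sort="comments", limit=50)
        if s.num_comments >= min_comments
    ]

for n, title, link in loud_threads("saas", "vs"):
    print(n, title, link)
```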

5. The duct tape itself

Look for tutorials titled "How I built X with [no-code tool stack]" or "My Notion + Zapier + Make.com workflow for Y". These are people who:

  • Felt the pain enough to build a workaround
  • Felt it enough to write about the workaround publicly
  • Have explicitly demonstrated they'll pay for tools (3 SaaS subscriptions stacked)

Their duct tape is your product spec. The combined monthly spend across their stack is roughly your willingness-to-pay ceiling: if someone's paying $80/mo across 4 tools to do the thing manually, $50/mo for a tool that does it natively is an obvious upgrade.
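
Spelled out as arithmetic (tool names and prices are made up for illustration):

```python
# Back-of-envelope ceiling check; tool names and prices are invented.
duct_tape = {"tool_a": 25, "tool_b": 30, "tool_c": 10, "tool_d": 15}  # $/mo each
ceiling = sum(duct_tape.values())  # $80/mo they already pay
your_price = 50                    # priced under the ceiling = obvious upgrade
print(your_price < ceiling)        # True
```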

How to combine the signals into a verdict

Each source individually is suggestive. The verdict comes from triangulating across them. The framework (a small scoring sketch follows the list):

  1. Real? Does the pain show up in at least 3 of the 5 sources? If only one source has it, you've found a niche too small or a phantom too vague.
  2. Painful? Are people building duct-tape workarounds, paying for adjacent tools, or arguing publicly? Active workarounds = active pain.
  3. Pay-worthy? Is the existing duct-tape stack costing them money already? "They're paying $X for the alternative" beats every other validation signal combined.
  4. Reachable? Where do they congregate? The signal source is the distribution channel — if you found them on YouTube, you can launch on YouTube.
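
If you want the checklist mechanical, here's a hypothetical encoding; the field names and thresholds are mine, taken straight from the four questions above:

```python
# Sketch: the four-question verdict as a checklist over collected evidence.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Evidence:
    sources_with_pain: int           # how many of the 5 source types surfaced it
    active_workarounds: bool         # duct-tape stacks, forks, public arguments
    ducttape_spend_per_month: float  # $ already going to the alternative
    channel: Optional[str]           # where the audience congregates, if known

def verdict(e: Evidence) -> dict:
    return {
        "real":       e.sources_with_pain >= 3,
        "painful":    e.active_workarounds,
        "pay_worthy": e.ducttape_spend_per_month > 0,
        "reachable":  e.channel is not None,
    }

print(verdict(Evidence(5, True, 60.0, "YouTube")))
# {'real': True, 'painful': True, 'pay_worthy': True, 'reachable': True}
```
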
The triangulation rule

A pain mentioned in 3+ independent source types is real. A pain mentioned 30 times within a single Reddit subreddit but invisible elsewhere is a niche subreddit's pet frustration, not a market.

A worked example

Let me work through a real candidate: "a tool that helps builders turn YouTube research into a product spec". (Yes, our own product. The exercise is a useful self-audit.)

  • Long-form video: dozens of indie-hacker channels show the workflow "watch videos → take notes → paste into Claude → manually shape into a CLAUDE.md". Multi-tool, manual, pained. ✓
  • GitHub: several "claude-code-prompts" and "claude.md examples" repos accumulating stars; many forks; issues requesting "auto-generate from research". ✓
  • Product reviews: NotebookLM reviews praise the single-source experience but lament "I wish I could synthesize across many videos and get a buildable output." ✓
  • Reddit: r/ClaudeAI and r/indiehackers threads regularly discuss "how do I structure my CLAUDE.md" and "what's the right way to do market research with AI". 100+ comments common. ✓
  • Duct tape: people stacking NotebookLM + Claude Pro + Notion + Cursor to do manually what a single tool could automate. ~$60/mo across the four. ✓

Verdict: real, painful, pay-worthy, reachable. (Caveat: this is the product I'm biased toward. The exercise still has to be honest. We ran exactly this audit before shipping, and it surfaced two adjacent pain points we didn't initially scope — which is why STARTER_PROMPT.md exists in the product alongside CLAUDE.md.)

Anti-patterns I see weekly

"My friends said they'd use it"

Friends are surveys with social pressure attached. They'll use it because they like you. They're allowed to like you and not be your market. Validate with strangers who have no reason to be polite.

Email-signup landing pages

Fake-door MVPs measure curiosity, not willingness to pay. A 30% email-signup rate on a $0-cost landing page tells you almost nothing. The behavioral signal comes later, when those people actually sign up for the product and pull out a card.

Stopping at "yes there's pain"

Pain is necessary but not sufficient. People are in pain about many things they will not pay to fix. The pay-worthy filter (existing $$$ being spent on duct tape) is the one that separates addressable pain from background suffering.

Confusing volume with intensity

100 mild "yeah this would be cool" comments are worth less than 10 intense "I have spent 40 hours and $200 trying to solve this" comments. Ten passionate users are a market. A hundred casual ones are a newsletter audience.

Stop reading. Start shipping.
Mine the long-form behavioral signal automatically

YouTubeToSaaS does step 1 of this framework end-to-end: drop in 15–20 long-form videos, get back the workflow patterns, the duct-tape mentions, and the explicit pain admissions — synthesized across all sources with citations.

Closing thought

Validation is mostly a discipline of looking at what people are already doing rather than asking them what they'd do. The five sources above produce more honest signal in an afternoon than a hundred-respondent survey produces in a month, because behavior doesn't lie and stated intent always does.

The hardest part is staying honest with yourself when the evidence doesn't support the idea you wanted to build. The triangulation rule (3+ sources, painful, pay-worthy, reachable) is a forcing function. If your candidate idea fails it, that's the cheapest possible no.