Mind the Gap: AI Won’t Think for You

Recently, I asked an AI tool to help me analyze some product data. It gave me back:

  • A polished, confident breakdown of the “growth trend” in our signup funnel
  • Conclusions that sounded… plausible
  • And exactly zero correct insights

Turns out, the trend didn’t exist. The math didn’t hold. It had skipped sanity checks that any half-awake analyst would do. But because it seemed structured and looked clean, it sounded right enough that I almost ran with it—which is exactly the danger.

LLMs improvise; they don’t analyze unless you steer them.

They don’t validate, they don’t question, and they sure as hell don’t understand your context. Unless you make them. And that’s the problem.

What Raw AI Gets Wrong (And Why It Matters)

The default AI experience is deceptively useful, which is why most people use LLMs like a vending machine:

  • Type in a vague prompt
  • Get back a fancy-sounding answer
  • Squint hard enough until it looks usable

Behind every “meh” AI output is a human who gave it a half-baked input. I’ve seen it again and again—in product spec drafts, data reviews, strategy docs, even team comms. The AI doesn't underperform. You just forgot to teach it how you think.

Here’s where most LLMs fall apart when used off-the-shelf:

  • They write like they just binged 100 LinkedIn posts and learned nothing.
  • They draw conclusions with no sanity checks, as if metrics live in isolation.
  • They’re structured, but spineless—no real point of view.
  • They’re confident but generic, and more often than not just flat-out wrong.
  • Worst of all: they’re polite about it. Which means you don’t even realize how much they’re screwing up until it’s too late.

I used to think AI was like a junior PM: promising, quick, but needs guidance. I take it back. It’s not a junior you’re onboarding; it’s an infant. Smart? Yes. Observant? Sure. But absolutely clueless unless you teach it everything from tone to logic to what “good” even looks like. You wouldn’t hand a baby your strategy doc and expect brilliance.

But with LLMs? People do that every day.

How I Got It to Actually Work for Me

There was a moment—somewhere around the third time an LLM confidently misread a CSV file—that I realized I needed to rethink how I was using these tools.

At first, I treated them like answer engines: ask a question, get an answer, move on. But the reality is: LLMs aren’t experts. They’re fast, fluent generalists—without your context, judgment, or nuance. And if you don’t give them that, they’ll default to sounding helpful while quietly drifting off-track.

So I changed my approach.


1. Stop treating AI like a genie and start treating it like a collaborator

This clicked recently when a colleague shared an excellent blog post by Harper Reed. In it, Harper describes how he sets up LLM interactions to be more conversational and iterative—less like Q&A, more like a whiteboard session with a smart partner.

Reading it took me back to the kind of messy, high-energy, pre-COVID brainstorms where clarity comes through the chaos—not before it. That’s when I realized I could use AI not just for answers, but for exploration.

Instead of asking:

# Write a product strategy for X.        

I now frame it like I’m onboarding a teammate:

# You’re a PM working on monetization. You prefer clarity over jargon. I’ll share context—ask questions if it’s unclear. Help me think out loud.        

The responses became sharper. More grounded. Less generic. It wasn’t about “getting it right” on the first try—it was about thinking together.


2. Give it meaningful product and team context

A generalist model doesn’t know your business, your org structure, or your users. So I started sharing light, public-safe context upfront:

  • What our product does and who it serves
  • Our general GTM motion and user types
  • My role, and how I typically approach strategic problems

With that, the AI stopped sounding like a generic blog post and started responding more like someone who had been around our world for a while.

(Note: This is not a CTA to share sensitive data or internal docs. Context doesn’t require secrets—it requires clarity.)


3. Stop winging your prompts and start structuring them like a brief

Once I started feeding the AI real context, I realized context alone wasn’t enough. I still needed to shape how I was asking.

So I ditched the vague “Can you help me with this?” prompts and started using real scaffolds—short, reusable templates that force clarity before creativity.

These are the two I use constantly:

Format → Reference → Request → Framing

I use this when I want the AI to stick to a clear structure, pull from something I’ve already shared, and stay focused. For example:

# Respond in a table with 3 columns: Question, Insight, Follow-up.
# Use the attached interview transcript as source material.
# Summarize key user pain points.
# You’re a UX researcher presenting this to product leadership—keep it concise but insight-rich.        

Goal → Format → Warning → Context Dump

This one’s my go-to when I’m asking the AI to help with something strategic, complex, or ambiguous—where it might otherwise spin off the rails. For example:

# I need help drafting a strategic narrative for our Q3 roadmap.
# Structure it as a 3-part memo: context, challenge, and proposed focus.
# Do not assume growth is good unless you’ve validated it. Watch for data hallucination.
# [Insert everything I’ve got: key metrics, prior decisions, leadership input, known tensions, etc.]        

Is this overkill? Sometimes. But 9 times out of 10, these formats save me from bad drafts and back-and-forth. They force me to clarify what I actually want before hitting send.

And the best part? Once you’ve written a few of these, you start thinking more clearly—even when the AI’s not involved.
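
If you work with these models through an API rather than a chat window, the same scaffold is easy to codify. Here’s a minimal Python sketch, assuming the OpenAI SDK and a placeholder model name; build_prompt is a hypothetical helper of my own, and the example values just illustrate the Format → Reference → Request → Framing shape.

from openai import OpenAI


def build_prompt(format_spec: str, reference: str, request: str, framing: str) -> str:
    """Assemble the four scaffold parts, in order, into one prompt."""
    return "\n".join([
        f"Format: {format_spec}",
        f"Reference: {reference}",
        f"Request: {request}",
        f"Framing: {framing}",
    ])


prompt = build_prompt(
    format_spec="Respond in a table with 3 columns: Question, Insight, Follow-up.",
    reference="Use the interview transcript pasted below as source material.",
    request="Summarize key user pain points.",
    framing="You're a UX researcher presenting to product leadership. Concise but insight-rich.",
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

The point isn’t the code. It’s that the template forces the same clarity whether you’re typing into a chat box or making an API call.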


4. Treat every response as a draft

If an LLM says:

# Growth increased by 40%.        

My first reaction is now:

# Compared to what? Over what time period? Based on which metric?        

It’s not about doubting the tool—it’s about applying the same healthy skepticism I’d use with any analysis. AI is excellent at correcting itself, but only when you challenge it.

So I pressure-test responses the same way I’d review a junior analyst’s first take: respectfully, but critically.


5. If you do one thing—set up custom instructions

Seriously. This is the hill I will die on.

Custom instructions are the single most underrated unlock when working with LLMs. They don’t just “improve” results—they completely change the nature of the interaction.

Without them, you’re working with a blank-slate intern who’s overly polite and wildly confident. With them? You’ve got a co-pilot that talks like you, thinks (kind of) like you, and actually delivers useful output on the first try.

Here’s what I include in mine:

  • How I write: Structured, sharp, and allergic to vague takeaways
  • How I think: Start with sanity checks, layer context, then evaluate tradeoffs
  • What I want: Specific, reasoned, and written like someone who’s shipped real product
  • What to avoid: Generic templates, filler sentences, or overly sanitized “advice”

And once that’s set, I take it one step further:

# Here’s how I asked you to behave. Based on your last few answers—what did you miss?        

The results speak for themselves. It took time to get it dialed in, but now? I don’t repeat myself. I don’t wrestle the tone. I don’t need to explain what “good” looks like in every new thread. Set it up once. Benefit every time.

If you’re skipping custom instructions, you’re choosing to work with a version of the AI that doesn’t know you exist. Fix that.
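
And if you’ve moved past the chat interface, the closest analogue to custom instructions is a fixed system message you reuse on every call. A minimal sketch below, again assuming the OpenAI SDK and a placeholder model name; the instruction text simply mirrors the bullets above.

from openai import OpenAI

# Standing instructions, reused on every request (wording mirrors the bullets above).
CUSTOM_INSTRUCTIONS = (
    "How I write: structured, sharp, allergic to vague takeaways. "
    "How I think: start with sanity checks, layer context, then evaluate tradeoffs. "
    "What I want: specific, reasoned answers written like someone who has shipped real product. "
    "What to avoid: generic templates, filler sentences, overly sanitized advice."
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask(question: str) -> str:
    """Every request starts from the same instructions, so I never repeat myself."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": CUSTOM_INSTRUCTIONS},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


print(ask("Sanity-check this claim before I use it: signups grew 40% quarter over quarter."))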


6. Compare models when you need a second opinion

Sometimes, when I’m working through a complex prompt—strategy framing, user messaging, or anything nuance-heavy—I’ll run the same input through multiple LLMs (ChatGPT, Claude, Gemini, Perplexity, etc.).

Not to “benchmark,” but to surface blind spots. Each model brings its own flavor: one might be better at structuring, another more nuanced with tone, another more risk-aware.

Then comes the fun part. I’ll ask one model to critique the others:

# Here are four different responses to the same prompt. Which one is strongest, and why?        

The irony? Sometimes a model will vote against its own response.

Like an honest panelist on a debate show going, “Yeah, actually… the other one nailed it.” When that happens, I know I’m onto something.

This isn’t scientific. But it is a great way to pressure-test thinking without relying too heavily on a single tool’s output. It also reminds me that AI’s real strength isn’t certainty—it’s perspective. And when you stack those perspectives thoughtfully, you get something stronger than what any one model would’ve come up with solo.
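
For what it’s worth, the same second-opinion loop is easy to script. A rough sketch below, using only OpenAI-style placeholder model names; in practice each vendor has its own SDK, and you’d swap those in for a true cross-model comparison.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODELS = ["gpt-4o", "gpt-4o-mini"]  # placeholder model names


def ask(model: str, prompt: str) -> str:
    """Send one prompt to one model and return the text of its reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


prompt = "Draft a 3-part memo framing our Q3 roadmap: context, challenge, proposed focus."

# Same prompt, every model.
answers = {model: ask(model, prompt) for model in MODELS}

# Then one model judges the whole set, its own answer included.
critique = "Here are different responses to the same prompt. Which one is strongest, and why?\n\n"
critique += "\n\n".join(f"--- {name} ---\n{text}" for name, text in answers.items())
print(ask(MODELS[0], critique))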


In Case You Skimmed Everything Above…

This isn’t about prompt engineering for the sake of it. It’s about respecting the tool enough to set it up properly.

  • LLMs don’t think for you. They don’t reason. They mirror.
  • If you prompt like a tourist, you’ll get shallow results.
  • If you prompt like an owner—with structure, tone, and context—you’ll get a partner that actually pulls its weight.

And once you do? It stops feeling like automation. It starts feeling like leverage.

Onwards & Upwards, Over & Out.
