The Illusion of Insight: When Blind Spots Become Strategy

The Problem

AI can supercharge how we work. It spots patterns we miss, automates the mundane, and makes insight generation faster and more scalable than ever before. When powered by clean data and clear goals, it’s a force multiplier.

But here’s the catch: AI doesn’t know what’s real. It doesn’t know when the data it’s trained on is incomplete, manipulated, or misleading. It doesn’t know when the question it’s answering is the wrong one. And it certainly doesn’t know when its output is being used out of context.

Sam Altman, CEO of OpenAI, put it plainly:

AI will do whatever it’s told to do — and it will hallucinate if it has to.
— Sam Altman


That’s the real risk. Not that AI makes mistakes — but that it makes them confidently. Repeatedly. At scale.

The bigger danger isn’t the algorithm. It’s how we use it. When teams assume AI is “smart enough,” they stop asking questions. They don’t dig into the data. They don’t pressure-test the results. And that’s when things go off the rails.

The issue isn’t bad AI. It’s blind trust — in the data, the model, and the output — without understanding where each comes from or how it’s being applied.

A Real-World Example: The Chargeback Mirage

We were brought in to help a fintech client analyze chargeback data for their IT infrastructure, covering servers, storage, mainframes, and more. The goal was clear: identify areas to optimize, trim excess, and reduce spend. But as we dug into the data, things didn’t quite add up.

On the surface, the dataset appeared organized and ready for analysis. But underneath, the numbers had been shaped to tell a particular story — one that aligned with internal cost targets, not operational truth. Volumes weren’t actual counts of infrastructure; they had been manipulated to align with fixed unit pricing baked into legacy contracts. Meanwhile, resources were bundled together in ways that obscured cost and performance differences, making it difficult to assess true unit economics. And in some cases, line items were still being billed using outdated rates that no longer reflected actual costs.

It wasn’t just a data problem — it was a systems problem, and more importantly, a people problem. These distortions had built up over time through shortcuts, shifting incentives, and attempts to simplify billing. By the time AI tools were brought in, the data looked clean — but it no longer told the truth.

The kicker? None of this manipulation was malicious. It accumulated gradually: spreadsheets passed across teams, assumptions hardened into systems, and a veneer of structure masked a broken foundation.

What helped us cut through the noise wasn’t just smarter tooling — it was conversation. Speaking with procurement teams, operations leads, and on-the-ground engineers revealed what the data couldn’t: the backstories, workarounds, and historical baggage that shaped how the numbers came to be.

The Risk Isn’t Bad AI — It’s Scalable Wrongness

The chargeback example we explored earlier wasn’t a failure of AI, but that’s exactly the point. It revealed how even basic data, when poorly structured and misinterpreted, can drive faulty conclusions and missed opportunities. Now imagine that same flawed data plugged into an AI model, one that generates recommendations, forecasts costs, or allocates resources. The model wouldn’t question the inputs. It would scale them.

This is the real risk: not AI making isolated errors, but AI accelerating flawed thinking when no one pauses to interrogate the foundation.

That pattern plays out across industries:

  • Healthcare – Risk prediction that reinforced bias:
    A 2019 study in Science revealed that a widely used algorithm by Optum significantly underestimated the health risks of Black patients. The model used historical healthcare spend as a proxy for need—an approach that overlooked systemic disparities in care access. Black patients, who often incurred lower costs due to underdiagnosis and undertreatment, were deprioritized for critical interventions. The issue wasn’t bad math—it was a flawed assumption, never questioned.

  • Recruiting – Biased models built from biased histories:
    Amazon abandoned an internal AI recruiting tool after realizing it penalized resumes from women. The model had trained on past hiring patterns, which favored men. Without oversight, it learned to associate male-coded language and credentials with stronger candidates—replicating existing biases under the guise of objectivity.

  • Retail – When models fail to adapt:
    Post-pandemic, companies like Target and Walmart leaned on AI models to predict demand. But as consumer behavior shifted again, those models—still tuned to lockdown-era trends—overestimated demand for home goods and electronics. Target ended up with over $1 billion in unsold inventory. The models weren’t broken; the context had changed, but the assumptions hadn’t.

These aren’t one-off cautionary tales from the past—they reflect a systemic challenge organizations are still grappling with. A 2021 MIT study found that 65% of business leaders said they would trust AI-generated insights, but only 23% regularly audited the data used to train those models. Gartner estimates that through 2027, 75% of organizations will experience visible business disruptions due to flawed or biased data used in AI models. And a McKinsey report notes that less than 30% of companies have clear protocols in place to track model drift or audit training data regularly.

To see how this plays out in practice, let’s return to our earlier chargeback example. Imagine if our client had blindly trusted the manipulated chargeback data and used it to guide cost-reduction strategies. An AI model trained on that data might have flagged certain IT resources as disproportionately expensive—recommending volume cuts to services the client wasn’t even using, or suggesting contract renegotiations for “high-cost” units that didn’t actually exist because multiple resources had been bundled together. None of these actions would have addressed the real issue: missing context around how different IT services were provisioned and tracked. But the model’s outputs would have looked rational—and even urgent. That’s the danger: confidently scaling the wrong conclusion.
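
To make the bundling effect concrete, here is a deliberately simplified sketch in Python. Every number and label is invented for illustration; the point is only to show how a blended line item produces a unit rate that matches no real benchmark, which is exactly the kind of signal a model would compare and act on without blinking.

    # Hypothetical illustration of how bundled line items distort unit economics.
    # Every number here is invented for the example; nothing comes from real client data.

    # What the chargeback file shows: a single bundled line item.
    bundled = {"label": "compute_and_storage", "total_cost": 200_000, "billed_units": 500}

    # What actually sits behind it: two resources with very different economics.
    actual = [
        {"service": "servers", "cost": 150_000, "units": 300},  # ~$500 per unit
        {"service": "storage", "cost": 50_000, "units": 200},   # ~$250 per unit
    ]

    # Illustrative per-unit reference prices a model might benchmark against.
    benchmark = {"servers": 520, "storage": 260}

    # The bundled view yields one blended rate that matches no real benchmark...
    blended_rate = bundled["total_cost"] / bundled["billed_units"]
    print(f"Blended rate: ${blended_rate:.0f}/unit for '{bundled['label']}'")

    # ...while the unbundled view shows each service roughly in line with its market.
    for row in actual:
        rate = row["cost"] / row["units"]
        print(f"{row['service']}: ${rate:.0f}/unit vs benchmark ${benchmark[row['service']]}/unit")

A human who knows the bundle exists reads the blended figure as an artifact; a model benchmarking “high-cost” units sees a renegotiation candidate.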

What ties all of this together isn’t faulty technology—it’s unquestioned inputs and unexamined logic. AI isn’t some all-seeing oracle; it mirrors the data and incentives it’s fed. And when teams treat AI outputs as inherently trustworthy, they risk codifying error, bias, or misalignment at scale.

In organizations without a culture of healthy skepticism—where teams don’t pause to ask “does this make sense?” or “what’s driving this result?”—AI becomes less of a tool and more of a trap. Assumptions go unchallenged. Data gets recycled, not refined. And flawed conclusions harden into strategy.

The Fix Isn’t Smarter AI — It’s Smarter Systems and Culture

The solution to bad AI outcomes isn’t more AI. The real fix is far more human: building systems that keep data accountable, and cultures that encourage people to question what they’re seeing—especially when it confirms what they want to believe.

If we return to the chargeback example, it wasn’t an algorithm that uncovered the manipulation. It was people—curious, skeptical people—who asked the right questions and followed up on inconsistencies. AI could have modeled the data faster or identified some patterns more cleanly, but it couldn’t have challenged the assumptions hiding behind the numbers. That work requires judgment, context, and above all, a culture that rewards asking “why?”

Organizations don’t need complex audits or formal checklists to start fixing this. They need a few deliberate habits that help surface flawed data before it’s scaled, and a leadership tone that makes it safe to challenge the output—especially when it comes dressed as insight.

Data Hygiene: Build Systems That Surface Problems Early

The root cause of most AI missteps isn’t the model—it’s the data. And while no organization is immune to messy data, the ones that avoid major failures tend to have checks built in to catch issues before they’re scaled.

Some practical safeguards and habits include:

  • Treat data quality as a process, not a project. It’s not enough to clean data once before a model is deployed. Set up routines—monthly reviews, field audits, or system health checks—so issues don’t fester.

  • Interrogate the incentive structure. Ask what behaviors may have shaped the data—what was the team optimizing for when the data was entered, bundled, or categorized this way?

  • Understand context, not just timing. It’s not only when the data was collected that matters, but why. What decision, process, or constraint was shaping the inputs at the time? Are those factors still relevant?

  • Surface temporal drift. Historical rates, categorizations, or logic often lag behind operational shifts. Highlight variables that haven’t been updated in months or years (a lightweight check along these lines is sketched after this list).

  • Make metadata transparent. Track where each field comes from, how it’s calculated, when it was last touched, and who’s accountable for it. It doesn’t need a fancy AI ops tool—just a shared Notion page, a maintained spreadsheet, or an internal wiki where everyone can access and audit the details.
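
The last two habits lend themselves to a small amount of automation. As a minimal sketch, assuming the team keeps some kind of field register (the field names, sources, owners, and threshold below are all invented for illustration), a short script can flag fields that have no owner or haven’t been reviewed recently:

    from datetime import date

    # Hypothetical field register: in practice this could live in a shared
    # spreadsheet or wiki. Field names, sources, owners, and dates are invented.
    FIELD_METADATA = [
        {"field": "unit_rate", "source": "legacy_contract_2019.xlsx",
         "owner": "procurement", "last_reviewed": date(2021, 3, 1)},
        {"field": "server_volume", "source": "cmdb_export",
         "owner": "infra_ops", "last_reviewed": date(2024, 11, 15)},
        {"field": "storage_bundle_cost", "source": "manual_allocation_sheet",
         "owner": None, "last_reviewed": date(2020, 6, 30)},
    ]

    STALE_AFTER_DAYS = 365  # the threshold is a judgment call, not a standard

    def audit(metadata, today=None):
        """Flag fields with no accountable owner or that haven't been reviewed recently."""
        today = today or date.today()
        for m in metadata:
            issues = []
            if m["owner"] is None:
                issues.append("no accountable owner")
            age_days = (today - m["last_reviewed"]).days
            if age_days > STALE_AFTER_DAYS:
                issues.append(f"last reviewed {age_days} days ago")
            if issues:
                print(f"{m['field']} ({m['source']}): " + "; ".join(issues))

    audit(FIELD_METADATA)

The value isn’t the script itself; it’s that the check forces someone to answer who owns a field and when it was last questioned.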

Ultimately, these practices rely less on tooling and more on ownership and intentionality. If a model can’t explain its inputs, a human needs to be able to.

Culture & Leadership: Reward People Who Ask the Right Questions

Even the best systems fail when the culture discourages dissent. This is why leadership tone matters so much—because when people feel pressure to validate a decision or hit a deadline, the quality of questioning suffers. Instead of probing assumptions, teams go hunting for proof.

To change this, leaders must model intellectual honesty. That means celebrating when someone spots a flaw in the logic, not penalizing them for slowing things down. It means rewarding people for raising concerns early, before they snowball into PR crises or compliance issues. And it often means bringing in an outside perspective—a fresh pair of eyes unencumbered by internal politics or sunk cost bias—to challenge what the org thinks it knows.

External advisors and partners can play a valuable role here. Not just in data cleaning or model validation, but in shaping the kinds of questions an organization is willing to ask itself. When teams get too close to the data, they often stop noticing what’s missing.

What separates strong cultures from fragile ones isn’t how fast they move—it’s how often they pause to question their direction. A few principles that help:

  • Leaders must set the tone. When executives openly question reports or models, especially those coming from senior teams, it signals that scrutiny is not only allowed but expected.

  • Create space for friction. Fast-moving organizations often suppress dissent in favor of quick wins. But friction, when channeled well, protects against deeper failures.

  • Encourage second-order thinking. Go beyond “what is the model saying” to “why is it saying this?” and “what might it be missing?”

  • Reward early intervention. It should feel safe—and even encouraged—for a junior analyst or frontline ops lead to raise their hand and say, “this doesn’t look right.”

  • Know when to bring in new perspectives. A fresh set of eyes—internal or external—can help spot the blind spots a team has stopped seeing.

What Smart Orgs Do Differently

Checks, Balances, and a Healthy Dose of Skepticism!

Remember: Garbage In, Gospel Out

AI doesn’t make up its mind — it reflects yours. That means the work of scrutiny starts before the model ever runs. It begins with a close look at the data: how it was collected, why it was collected, how it’s been transformed, and where assumptions may have quietly slipped in. The more foundational this work is, the less downstream damage there will be.

But that scrutiny can’t stop once the model generates an output. If the answer feels “off,” don’t rationalize it — investigate. Is the insight real? Is it repeatable? Did the model lean on the wrong signal or misinterpret proxy data? Asking these questions isn’t a sign of distrust in AI — it’s how you build qualified trust.
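
“Is it repeatable?” can start as something very lightweight. One option, sketched below with invented numbers, is to resample the underlying records and see whether the headline conclusion survives. This isn’t a formal audit, just a quick stability check before a recommendation becomes a decision.

    import random
    from collections import Counter

    # Hypothetical gut check: is the "top cost driver" an analysis surfaced stable,
    # or an artifact of a handful of records? All data below is invented.
    records = [
        {"service": "storage", "cost": 120_000},
        {"service": "mainframe", "cost": 95_000},
        {"service": "storage", "cost": 110_000},
        {"service": "servers", "cost": 80_000},
        {"service": "mainframe", "cost": 90_000},
        {"service": "servers", "cost": 70_000},
    ]

    def top_cost_driver(rows):
        totals = Counter()
        for r in rows:
            totals[r["service"]] += r["cost"]
        return totals.most_common(1)[0][0]

    # Bootstrap: resample the records many times and count how often the same
    # service comes out on top. A conclusion that flips under light resampling
    # deserves a second look before anyone acts on it.
    winners = Counter()
    for _ in range(1_000):
        sample = random.choices(records, k=len(records))
        winners[top_cost_driver(sample)] += 1

    for service, count in winners.most_common():
        print(f"{service}: top driver in {count / 1_000:.0%} of resamples")

If the answer changes meaningfully from one resample to the next, that’s a cue to go back to the inputs before going forward with the output.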

Still, asking the right questions can be hard when you’re buried in the weeds. That’s where an outside perspective can help. Someone with distance from the data — but deep familiarity with how models, metrics, and incentives collide — can help surface the blind spots you’ve stopped seeing.

In the end, the real danger isn’t that AI will get things wrong. It’s that we’ll treat its answers as gospel — even when the data was garbage (at least for this purpose and in this context) to begin with.

Ready to question what your AI systems take for granted? Let’s talk.


Authored by Aryanshi Kumar

Aryanshi Kumar, an alumnus of IIT Delhi & Wharton, is a former consultant with BCG (Chicago) and has worked extensively across small and large organizations, helping clean, condense, and analyze data to generate actionable insights.
