The Post-Hype Diagnostic

Is Your AI Initiative Grounded in Reality?

A 42-point diagnostic from Engineering Reliable AI Agents & Workflows

The Problem

95% of enterprise AI pilots fail to deliver business value. Not because the technology doesn't work—but because organizations skip the foundational work.

What goes wrong:

  • Stakeholders expect transformation in weeks (it takes 6-12 months)
  • No one budgets for human oversight teams
  • "Success" is defined as "better experience" instead of measurable outcomes
  • Data quality issues surface after $200K is spent
  • Feedback loops don't exist or take weeks to close

What this diagnostic does: It surfaces gaps in problem definition, technical readiness, and organizational alignment before you spend the budget. Ten minutes now saves six months of explaining failure later.

How It Works

Phase 1

Go/No-Go Protocol

Five prerequisite questions. All must be "Yes" to proceed. One "No" = stop and fix that gap first.

Phase 2

The Scorecard

14 criteria scored 0-3 across three areas. Total possible: 42 points.

Phase 3

Your Zone

Your score places you in one of four zones. Each zone has specific next-step recommendations.
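The three phases above amount to a small decision procedure, sketched below. The criterion names and the zone cutoffs in this example are illustrative placeholders only; the real thresholds and definitions come from the full diagnostic, not from this sketch.

```python
# Sketch of the diagnostic flow. Zone cutoffs are HYPOTHETICAL
# placeholders -- the real thresholds are in the full diagnostic.

def go_no_go(answers: list[bool]) -> bool:
    """Phase 1: five prerequisite questions; all must be Yes."""
    return len(answers) == 5 and all(answers)

def score_total(criteria: dict[str, int]) -> int:
    """Phase 2: 14 criteria, each scored 0-3 (max total 42)."""
    assert len(criteria) == 14, "expected exactly 14 criteria"
    assert all(0 <= v <= 3 for v in criteria.values()), "scores are 0-3"
    return sum(criteria.values())

def zone(total: int) -> int:
    """Phase 3: map the total score to one of four zones.
    Cutoffs here are invented for illustration only."""
    if total <= 14:
        return 1  # Critical gaps: do not proceed
    if total <= 24:
        return 2  # High risk: address lowest-scoring areas first
    if total <= 34:
        return 3  # Solid foundation: phased deployment
    return 4      # Ready for systematic execution
```

For example, a team that scores every criterion at 2 totals 28 and, under these placeholder cutoffs, would land in Zone 3.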

The Assessment Areas

Part 1: Problem-Solution Fit

Does this problem actually need AI?
Evaluates whether your use case benefits from probabilistic AI—or whether you're using a sledgehammer on a thumbtack.

  • Uncertainty tolerance and failure definitions
  • Human performance baselines
  • Error tolerance during the learning curve

Key Question: Probabilistic Tolerance: Does this use case benefit from a system that "guesses," or does it require 100% precision?

If you need 100% accuracy—financial calculations, regulatory reporting, safety-critical decisions—you probably need traditional software, not AI.

Part 2: Technical Reality

Can your systems and data support this?
"We have data" isn't enough. You need the right data, properly labeled, with documented edge cases.

  • Data quality and labeling standards
  • Integration complexity and feedback loop speed
  • Data sovereignty and compliance requirements

Key Question: Edge Case Documentation: Have you identified the "weird" edge cases that will confuse the model?

Your AI's failure modes are predictable. Teams that compile a "Golden Dataset" of tricky edge cases before development starts have far higher success rates.

Part 3: Organizational Readiness

Is your organization set up for AI success?
This is where most initiatives actually fail.

  • Stakeholder expectations and solution selection rigor
  • Shadow AI awareness and failure budget
  • Human oversight resources and escalation paths

Key Question: Stakeholder Expectations: Do stakeholders understand this is a 6-12 month journey, not an overnight transformation?

Executives expecting "transformation" in 8 weeks will kill your project—even if the technology works. Misaligned expectations create impossible pressure.

What Your Score Tells You

Your total places you in one of four zones:

Zone 1: Critical gaps

Do not proceed.

Zone 2: High risk

Address lowest-scoring areas first.

Zone 3: Solid foundation

Phased deployment recommended.

Zone 4: Ready

Ready for systematic execution.

The complete diagnostic includes zone thresholds, detailed definitions, and specific recommendations for each outcome.

Who Should Use This

CTOs / VPs of Engineering

Evaluating technical and organizational foundations

Product Leaders

Deciding whether to greenlight AI features or redirect resources

Project Sponsors

De-risking budget allocation before major investment

Technical Architects

Assessing integration complexity and data readiness

Operations Leaders

Planning human-in-the-loop processes

CFOs

Validating whether an AI budget request is grounded in reality

Run this as a team.

Schedule 90 minutes with your technical lead, a business stakeholder, and someone who does the work you're automating. Disagreements on scores reveal misalignments that will derail you later.

Frequently Asked Questions

What is an AI project risk assessment?
A systematic evaluation of whether your AI initiative has the foundations for success: data quality, human oversight, feedback loops, and realistic expectations. Unlike traditional IT assessments, AI evaluations must account for probabilistic systems. 95% of pilots fail—most due to gaps a proper assessment would catch.
Why do most enterprise AI initiatives fail?
Organizations treat AI as deterministic software when it's probabilistic. Common failures: unrealistic expectations, no failure metrics, no human oversight budget, poor data, slow feedback loops. MIT found only 1 in 20 AI tools reaches production.
How do I know if my organization is ready?
Answer these: Can you tolerate 20% error rates initially? Do you have legal approval for your data? Is there a human escalation path? Can you measure success with numbers, not vibes? Any "no" is a gap to fix first.
What's the difference between AI hype and reality?
Hype: Works perfectly day one, minimal oversight, immediate ROI. Reality: 6-12 months of iteration, dedicated oversight teams, feedback loops, patience. The gap between demos and production is where budgets burn.
Should I assess before or after selecting a vendor?
Before. Always. The assessment reveals whether you have the foundations for *any* AI to succeed. Selecting a vendor first means being swayed by demos that mask production limitations.

Download the Complete Diagnostic

Get the full 42-point diagnostic with Go/No-Go questions, scoring criteria, zone thresholds, and recommendations.

What you get:

  • 5 Go/No-Go prerequisite questions
  • All 14 scored criteria with detailed guidance
  • Zone thresholds and definitions
  • Recommendations for each outcome
  • Printable worksheet format
  • Notes template for team sessions

From the Book

This diagnostic is one of seven tools in Engineering Reliable AI Agents & Workflows. The book adds case studies, the complete "Post-Hype Audit" with security criteria, and frameworks for closing the gaps this diagnostic reveals.

Learn more about the book →