Most founders hire an AI advisor the same way they'd hire any consultant: they look for domain credentials, ask for references, and pick the person who sounds most confident about the technology. This process selects for exactly the wrong qualities. It rewards fluency over depth, optimism over rigour, and vendor relationships over architectural judgement.
U.S. companies spent $37 billion on generative AI alone in 2025, according to Harvard Business Review. Yet 71% of global chief information officers said their AI budgets would be frozen or cut if value couldn't be demonstrated within two years. The gap between spending and returns is widening, and much of it traces back to the advice founders received before writing their first line of code.
Choosing the right AI advisor isn't a hiring decision. It's a technical architecture decision dressed as a people problem.
The advisor trap most founders walk into
Why domain expertise alone misleads
A founder building an AI-powered logistics platform will naturally seek someone who knows logistics. This instinct is sound but insufficient. Domain experts who layer AI onto existing mental models tend to replicate manual processes with automation rather than rethinking the problem space entirely.
The more dangerous version: domain experts who completed one successful AI implementation and now treat that single pattern as universal. They'll recommend the same architecture, the same vendor stack, the same data pipeline regardless of whether your constraints resemble the ones they solved before. MIT Sloan research describes this as a familiar pattern: business leaders mistake early-stage AI breakthroughs for mature use cases, experience FOMO, and end up with implementations that fall short of expectations.
Domain knowledge matters. But it should inform the problem definition, not dictate the technical approach.
The difference between advice and implementation capability
Harvard Business Review reported on a telling exercise: when MBA students were asked to define what consultants do, they used phrases like "trusted advisor," "problem-solver," and "subject matter expert." None of them mentioned "results." From their perspective, consultants generate advice that clients are expected to turn into results, rather than producing results themselves.
This framing problem runs deep in AI advisory. An advisor who can explain transformer architectures at a whiteboard but has never shipped a production inference pipeline will give you architecturally elegant recommendations that collapse under real-world latency requirements, data quality issues, and cost constraints. The advice sounds right. The implementation fails anyway.
What you want is someone who has felt the pain of a model that performed brilliantly in evaluation but degraded in production because the training distribution didn't match real user behaviour. That kind of scar tissue can't be acquired through reading papers or attending conferences.
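The mismatch between evaluation data and real user behaviour can be made measurable. What follows is a minimal Python sketch, not a definitive method, of one widely used drift check: the population stability index (PSI), which compares how a single numeric feature is distributed in training data versus live traffic. The function name, the ten-bin default, the smoothing constant, and the sample values are all illustrative assumptions, not something drawn from the research cited in this article.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare two samples of one numeric feature; larger values mean more shift."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("expected sample has no spread to bin")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    p = proportions(expected)  # training-time distribution
    q = proportions(actual)    # production distribution
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Made-up data: training covers 0.00-0.99, production only sees 0.50-0.995.
train_scores = [i / 100 for i in range(100)]
prod_scores = [0.5 + i / 200 for i in range(100)]

same = population_stability_index(train_scores, train_scores)
shifted = population_stability_index(train_scores, prod_scores)
```

A PSI of zero means both samples fall into the bins in identical proportions. Practitioners often quote rough thresholds of about 0.1 and 0.25 for moderate and major shift, but those are conventions, not guarantees. An advisor with production experience should be able to explain which features they would monitor this way and what they would do when the number moves.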
What "AI experience" should actually mean
Technical depth vs. vendor fluency
There is a specific kind of AI advisor who can walk you through every major platform's feature matrix, recite pricing tiers from memory, and draw integration diagrams on demand. This person is a vendor expert, not a technical one.
Technical depth means understanding why a retrieval-augmented generation pipeline might outperform fine-tuning for your use case, or why it might not. It means knowing when a purpose-built model will outperform a general-purpose large language model. As Akamai's Robert Blumofe told MIT Sloan, large language models can be "a ridiculously expensive way to solve certain problems." Most enterprise AI problems require what he calls an "ensemble of technologies" brought together for a purpose-built solution rather than a single approach.
An advisor with genuine technical depth will tell you things you don't want to hear about your current architecture. A vendor-fluent advisor will tell you which product to buy.
Reading the difference between hype cycles and production readiness
Research from Gartner and academic work published on arXiv describe generative AI adoption as following a dual-stage process: the familiar hype cycle (technology trigger, peak of expectations, trough of disillusionment, slope of enlightenment, plateau of productivity) layered with emotional stages of organisational change including shock, denial, and integration.
Your advisor should be able to place specific technologies on this curve with precision. Not "AI agents are the future" but "agentic systems still experience ongoing hallucinations and security vulnerabilities like prompt injection, and experts predict a decade or more before these issues are fully resolved, though deployment with human-in-the-loop oversight may come sooner." That level of nuance, drawn from the MIT Sloan 2026 AI decision-makers' analysis, separates someone tracking the field from someone parroting keynote slides.
Questions that expose surface-level knowledge
Ask your prospective advisor these questions and listen to how they answer, not just what they say:
- "When would you recommend against using a large language model for this problem?" If they can't articulate specific scenarios where simpler approaches win, they're anchored to a single paradigm.
- "What's the most common failure mode you've seen in production AI systems?" Vague answers about "data quality" suggest surface familiarity. Specific answers about distribution drift, feedback loops, or evaluation methodology suggest real experience.
- "How would you approach this if our engineering team were half its current size?" Constraints reveal thinking quality. An advisor who only knows how to solve problems with more resources hasn't solved hard problems.
- "Walk me through how you'd evaluate whether a model is production-ready versus demo-ready." MIT Sloan's research highlights that LLM success at simple tasks like email classification can represent "success theatre" rather than solutions to complex enterprise problems. Your advisor should understand this distinction viscerally.
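One way to picture the demo-ready versus production-ready distinction from the last question is as a gate that checks more than a single blended accuracy number. The Python sketch below is a hypothetical illustration: the `EvalReport` fields, the slice names, and every threshold are assumptions chosen for the example, and a real gate would reflect your own latency, cost, and quality requirements.

```python
from dataclasses import dataclass


@dataclass
class EvalReport:
    # Accuracy per user segment or input type, not one blended number.
    accuracy_by_slice: dict[str, float]
    p95_latency_ms: float
    cost_per_1k_requests: float


def readiness_failures(
    report: EvalReport,
    min_slice_accuracy: float,
    max_p95_latency_ms: float,
    max_cost_per_1k_requests: float,
) -> list[str]:
    """Return the reasons a model is not production-ready; empty means it passed."""
    failures = []
    for name, accuracy in sorted(report.accuracy_by_slice.items()):
        if accuracy < min_slice_accuracy:
            failures.append(
                f"slice '{name}' accuracy {accuracy:.2f} is below {min_slice_accuracy:.2f}"
            )
    if report.p95_latency_ms > max_p95_latency_ms:
        failures.append(
            f"p95 latency {report.p95_latency_ms:.0f} ms exceeds {max_p95_latency_ms:.0f} ms"
        )
    if report.cost_per_1k_requests > max_cost_per_1k_requests:
        failures.append(
            f"cost per 1k requests {report.cost_per_1k_requests:.2f} exceeds "
            f"{max_cost_per_1k_requests:.2f}"
        )
    return failures


# Made-up numbers: strong on the easy slice, weak on the hard one, and slow.
demo_report = EvalReport(
    accuracy_by_slice={"short_emails": 0.97, "multi_party_threads": 0.71},
    p95_latency_ms=2400.0,
    cost_per_1k_requests=3.10,
)
failures = readiness_failures(
    demo_report,
    min_slice_accuracy=0.85,
    max_p95_latency_ms=1500.0,
    max_cost_per_1k_requests=5.00,
)
```

In this invented report the model would impress in a demo built on short emails, yet it fails the gate on its hardest slice and on tail latency. A strong answer to the question above tends to have this shape: named slices, explicit thresholds, and operational limits alongside accuracy.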
How to assess strategic fit with your stage
Pre-product vs. scaling vs. optimising: different advisory needs
A pre-product startup needs an advisor who can help identify where AI creates defensible value in the product, not someone who optimises inference costs. A scaling company needs someone who understands how to move from a working prototype to reliable infrastructure serving thousands of concurrent users. A company optimising existing AI systems needs someone who can identify where you're leaving performance on the table.
These are different skill sets. The advisor who excels at zero-to-one product thinking may be the wrong person to help you reduce your compute bill by 40%. Ask explicitly which stage they've worked at most, and probe for specifics.
Whether they understand your constraints, not just the technology
Research published on arXiv examining AI deployment in public systems found that many AI systems fail at deployment rather than during model development, even when they perform well in internal testing. The Institutional Alignment Readiness framework they developed assesses five dimensions: institutional and operational compatibility, data ecosystem maturity, human oversight capacity, fiscal sustainability, and regulatory alignment readiness.
These dimensions apply to startups too, if in different proportions. An advisor who talks exclusively about model performance without asking about your data infrastructure, your team's capacity to maintain what gets built, or your runway relative to the implementation timeline is advising in a vacuum.
The gaps between technical viability and responsible deployment are, as the researchers note, most acute in resource-constrained settings. Startups are resource-constrained by definition.
Red flags in how they frame timelines and ROI
Be wary of any advisor who offers confident ROI projections for AI initiatives before understanding your data, your team, and your existing systems. Harvard Business Review's survey of AI investment returns found that isolated, piecemeal deployments, limited executive buy-in, and weak linkage to strategic goals form a pattern that recurs across companies of all sizes. Without a systematic way to decide where to start, how fast to move, and when to stop, AI efforts become a drain on attention and resources rather than a source of advantage.
An advisor worth hiring will resist giving you a timeline before they've assessed your starting position. They should frame initial engagements as diagnostic rather than prescriptive.
The build-vs-buy question they should help you think through
Advisors who default to custom builds
Some advisors reflexively recommend building custom AI systems. This bias often correlates with advisors who sell implementation services or who built their reputation on bespoke technical work. Custom builds do create genuine advantages: Deloitte's analysis of generative AI strategy notes that building enables tailored functionality and robust data security, though it requires greater investment.
But "greater investment" understates the ongoing commitment. Custom AI systems need monitoring, retraining, evaluation infrastructure, and dedicated engineering attention indefinitely. For a startup with twelve engineers, building a custom recommendation engine when a well-integrated third-party solution would serve the same purpose is a misallocation of scarce engineering capacity.
Advisors who default to off-the-shelf
The opposite bias is equally dangerous. Advisors who consistently recommend buying off-the-shelf tools may be optimising for speed of implementation at the expense of differentiation. Deloitte's research acknowledges that buying can lower costs and accelerate implementation, though it may compromise on privacy and flexibility.
The strategic question isn't speed. It's whether the AI capability you're building is a commodity or a competitive advantage. If your AI feature is table stakes in your market, buy it. If it's your moat, you'd better own the technical stack beneath it.
Harvard Business Review has argued that generative AI is dissolving the economic logic that made standardised enterprise software the only practical choice. Leaders must ask which workflows they actually need to own. Your advisor should help you answer that question rather than defaulting to either direction.
What a balanced perspective sounds like
A good advisor on build-vs-buy will ask about your competitive dynamics before your technical requirements. They'll distinguish between AI capabilities that should be proprietary and those that should be purchased. They'll factor in your team's maintenance capacity alongside the initial build cost.
Deloitte's analysis of AI-assisted software engineering found that Klarna replaced its Salesforce CRM with a GenAI-built internal platform, reducing the developer requirement from 20 to 5 people. That's a compelling example of building, but it required Klarna's specific combination of engineering talent, scale, and strategic commitment. Your advisor should be able to articulate why a similar approach would or wouldn't apply to your situation, not just cite the case study as proof that building always wins.
The scaling model for AI-assisted teams is shifting from "more developers equals more output" to "more context per developer equals more impact." An advisor who understands this shift will think differently about build-vs-buy than one still operating on pre-AI assumptions about engineering productivity.
Incentive alignment and engagement structure
Equity, retainer, or project-based: what each reveals
How an advisor structures their engagement tells you what they optimise for.
Equity-based arrangements align the advisor's interests with your long-term success but can create perverse incentives around fundraising narratives. An advisor with equity might encourage you to build impressive demos that inflate valuation rather than robust systems that serve users.
Retainer arrangements provide stability and ongoing access but can become comfortable. Without clear deliverables, retainer relationships drift toward general availability rather than focused impact.
Project-based engagements create accountability around specific outcomes but can lead to advisors optimising for project completion rather than your broader strategic position. They might solve the scoped problem while ignoring adjacent issues that matter more.
The structure that works best depends on your stage and needs. What matters is whether the advisor can articulate why they prefer a particular structure and what trade-offs it creates. If they can't discuss the downsides of their own preferred model, they haven't thought critically about their own incentives.
How to test for vendor-agnostic thinking
Ask your prospective advisor which cloud provider or AI platform they'd recommend, then ask them to argue against their own recommendation. An advisor locked into a single ecosystem (through partnerships, certifications, or familiarity) will struggle to make a convincing case for alternatives.
MIT Sloan's 2026 analysis of AI decision-making emphasises that organisations implementing generative AI predominantly take an individual-level approach to boost employee productivity rather than applying it to enterprise workflows and processes. This pattern often reflects vendor-driven thinking: the tools are designed for individual productivity, so that's what gets implemented. An advisor who thinks at the systems level will push beyond individual-tool adoption toward workflow-level transformation.
When an advisor's network becomes a liability
Advisors with strong vendor relationships can provide valuable introductions and negotiating leverage. But those same relationships create bias. If your advisor has a referral arrangement with an infrastructure provider, their recommendation to use that provider should be treated with appropriate scepticism.
Ask directly: "Which vendors do you have financial relationships with?" Any hesitation or evasion is informative. The best advisors disclose these relationships proactively because they understand that transparency is the only way to maintain credibility when conflicts exist.
Evaluating their track record without relying on testimonials
What to look for in their public thinking
Testimonials are curated. No advisor publishes the negative ones. Instead, read what they write and say publicly. Look for specificity over generality. An advisor who writes "companies should adopt AI strategically" is saying nothing. An advisor who writes about specific failure modes in retrieval-augmented generation pipelines, or who analyses why a particular architectural pattern breaks at scale, is demonstrating genuine expertise.
Look for intellectual honesty. Do they acknowledge when a technology they previously advocated turned out to be less capable than expected? Do they update their views as the field evolves, or do they maintain the same position regardless of new evidence?
The MIT Sloan research on AI hype cycles notes that a significant gap exists in understanding societal reception and adaptation to generative AI tools. An advisor who acknowledges uncertainty and complexity rather than projecting false confidence is likely to give you better guidance.
How they talk about failure and technical debt
Listen for how candidates discuss projects that didn't work. An advisor who presents an unblemished record is lying, is inexperienced, or has only taken safe engagements. The AI field moves too fast and the technical risks are too real for anyone with meaningful experience to have avoided failure entirely.
More telling than the failure itself is how they analyse it. Do they blame external factors (the client's data was bad, the team wasn't committed) or do they identify what they could have done differently? The best advisors have specific, uncomfortable stories about recommendations they made that turned out to be wrong, and they can articulate what they learned.
Research on AI deployment in public systems found that two technically viable AI systems in education reached working prototypes but couldn't advance to broader rollout due to institutional rather than technical reasons. An advisor who has encountered similar situations and can discuss what they'd do differently demonstrates the kind of systemic thinking that prevents expensive failures.
Asking for the engagement that went wrong
Make this a standard part of your evaluation process. Ask: "Tell me about an advisory engagement that didn't deliver the expected results. What happened and what would you do differently?"
The quality of the answer matters more than the content of the failure. You're evaluating self-awareness, analytical rigour, and honesty. An advisor who can't or won't answer this question is one you should pass on.
Moving from selection to productive engagement
Setting the terms before the first session
Before your advisor's first billable hour, establish clarity on four points: what decisions they're being hired to inform, what information they'll need access to, how you'll measure whether the engagement is productive, and what happens if it isn't working.
The MIT Sloan 2026 AI survey found that 38% of large enterprises have appointed a chief AI officer or equivalent role, but there is little consensus on reporting structure. This same ambiguity can plague advisory relationships. Your advisor needs to know who they report to, whose time they can request, and what authority their recommendations carry. Without this clarity, even brilliant advice gets lost in organisational friction.
How to know within 90 days whether it's working
Set a 90-day evaluation checkpoint before the engagement begins, not after. Define two or three specific outcomes you expect by that point. These shouldn't be "delivered a strategy document" (that's activity, not outcome) but rather "helped us make the build-vs-buy decision on our recommendation engine with a clear technical rationale" or "identified and deprioritised two AI initiatives that weren't aligned with our product strategy."
Harvard Business Review's portfolio approach to AI investment management offers a useful frame here: without a systematic way to decide where to start, how fast to move, and when to stop, AI efforts become a drain rather than a source of advantage. Your advisor should be helping you make those decisions with more confidence and precision than you could alone. If after 90 days your decision-making quality hasn't measurably improved, the engagement isn't working regardless of how impressive the advisor's credentials are.
The test is simple. Are you making better technical decisions faster? If yes, the advisor is earning their fee. If no, it's time for a direct conversation about what needs to change, or whether to part ways.
Hiring the right AI advisor is one of the highest-leverage decisions a startup founder can make, and one of the easiest to get wrong. The questions in this article are designed to filter for the rare combination of technical depth, strategic judgement, and intellectual honesty that separates genuinely valuable advisors from credentialed commentators. If you're ready to work with a team that builds AI solutions that exploit the technical potential most companies never reach, rather than bolting on surface-level features, get in touch with Agathon.
References
- How to Break the AI Hype Cycle and Make Good AI Decisions for Your Organization
- Action Items for AI Decision Makers in 2026
- Hype and Adoption of Generative Artificial Intelligence Applications
- Build, Buy, or Adopt Generative AI in Digital Procurement
- AI-Assisted Software Engineering: Rewriting the Build Versus Buy Playbook
- The End of One-Size-Fits-All Enterprise Software
- Let's Hold Consultants Accountable for Results
- 7 Factors That Drive Returns on AI Investments, According to a New Survey
- Beyond Model Readiness: Institutional Readiness for AI Deployment in Public Systems
- Manage Your AI Investments Like a Portfolio



