Technical AEO Foundations: What Most Teams Skip

Your content team optimizes for AI citations. But AI crawlers can't even read your site. Here's the technical foundation you need to fix first.

By Tina Chu | January 2026 | 7 min read

The Technical Gap Most Teams Skip

Most AEO guides jump straight to content optimization: add FAQ schema, structure your answers, include expert attribution. All good advice—if AI can actually read your content.

Here's the problem: AI crawlers don't execute JavaScript. If your site renders content client-side (React, Vue, Angular without SSR), AI assistants see an empty page. Your carefully optimized content is invisible.

The disconnect: CXL research found 73% of page-one results use schema markup, but 88% of sites still don't implement it properly. Content optimization without technical foundations is wasted effort.

Scope note: Technical foundations ensure AI can access your content, which is the prerequisite for any AEO strategy. However, research suggests website content accounts for only ~15% of AI mentions; the remaining 85% comes from off-site sources [2]. This article focuses on the on-site technical requirements. For off-site strategy, see Measuring AEO.

Why AI Crawlers Are Different from Google

Googlebot has evolved to execute JavaScript and render pages like a browser. AI crawlers haven't. GPTBot, ClaudeBot, PerplexityBot, and others operate more like early search crawlers—they fetch HTML and parse what's there.

| Crawler | JavaScript Execution | What It Sees |
|---|---|---|
| Googlebot | Yes (full rendering) | Complete rendered page |
| GPTBot (OpenAI) | No | Raw HTML only |
| ClaudeBot (Anthropic) | No | Raw HTML only |
| PerplexityBot | No | Raw HTML only |

The scale is significant. Vercel data shows GPTBot made 569 million requests across Vercel's network in a single month. If your content requires JavaScript to render, none of that crawl traffic ever sees it.

Important caveat: Crawl access does not guarantee citations. GPTBot can crawl your site thousands of times without ChatGPT ever mentioning you. Technical accessibility is necessary but not sufficient: it ensures AI can read your content, not that AI will cite it [1].

Quick test: View your page source (Ctrl+U on Windows/Linux; on macOS, Cmd+Option+U in Chrome or Cmd+U in Firefox). If you see your content in the HTML, AI can read it. If you see mostly JavaScript bundles and empty divs, AI sees nothing.
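To run the same check across many pages, you can script it. The sketch below is a minimal illustration using only Python's standard library: it strips scripts and styles from a raw HTML string and checks whether a phrase survives. The function names and the choice of tags to skip are assumptions made for this example, and it approximates what a non-rendering crawler reads rather than reproducing any specific bot.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text found outside <script>, <style>, and <template> tags."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(raw_html: str) -> str:
    """Return the text a crawler could read without executing JavaScript."""
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


def phrase_in_raw_html(raw_html: str, phrase: str) -> bool:
    """Case-insensitive check that a key phrase exists in the raw HTML text."""
    return phrase.lower() in visible_text(raw_html).lower()
```

Feed it the unrendered response body (for example, what `urllib.request.urlopen(url).read().decode()` returns) plus a sentence you expect on the page. If the phrase is missing, the content is probably injected client-side.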

Server-Side Rendering: Non-Negotiable

Server-side rendering (SSR) generates HTML on the server before sending it to the browser. The content exists in the initial HTML response—no JavaScript execution required.

Client-Side vs Server-Side Rendering

Client-Side (CSR)

Server sends empty HTML + JavaScript. Browser executes JS to render content.

AI crawlers see: Empty page

Server-Side (SSR)

Server generates complete HTML with content. Browser displays immediately.

AI crawlers see: Full content
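The difference is easiest to see in the response body itself. This is a toy sketch, not a real framework: `render_csr_shell` and `render_ssr_page` are hypothetical helpers that build the two kinds of HTML a server might send.

```python
from html import escape


def render_csr_shell(bundle_url: str) -> str:
    """Client-side rendering: an empty mount point plus a script tag."""
    return (
        "<!doctype html><html><body>"
        '<div id="root"></div>'
        f'<script src="{escape(bundle_url, quote=True)}"></script>'
        "</body></html>"
    )


def render_ssr_page(title: str, paragraphs: list[str]) -> str:
    """Server-side rendering: the content is already in the HTML response."""
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!doctype html><html><body>"
        f"<h1>{escape(title)}</h1>{body}"
        "</body></html>"
    )
```

A crawler that does not run JavaScript gets the heading and paragraphs from the second function, and only an empty `div` from the first.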

Solutions by Framework

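| Framework | SSR Solution |
|---|---|
| React | Next.js (SSR/SSG), Remix, Gatsby |
| Vue | Nuxt.js |
| Angular | Angular Universal (now built into Angular as @angular/ssr) |
| Any SPA | Pre-rendering services (Prerender.io, Rendertron) |
| Static sites | No SSR needed; pages already ship as plain HTML (HTML, Hugo, Jekyll, Astro) |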

If you're on WordPress, Webflow, or static HTML—you're fine. If you're on a JavaScript SPA without SSR, this is your #1 priority before any content optimization.

llms.txt: The New robots.txt

llms.txt is an emerging specification that provides structured guidance to AI crawlers—similar to how robots.txt guides search engine crawlers.

What llms.txt Contains

  • Site description: What your site is about
  • Content structure: How your content is organized
  • Key pages: Most important content for AI to index
  • Contact/attribution: How to cite your content

# Example llms.txt structure

# Site: Novastacks AI
# Description: AEO and growth marketing consultancy
# Contact: hello@novastacks-ai.com

## Key Content
- /aeo - Main AEO services page
- /aeo/aeo-vs-seo - AEO vs SEO comparison
- /blog - Latest insights on AI marketing

## Citation Format
Please cite as: "Novastacks AI (novastacks-ai.com)"

Reality check: Current research shows llms.txt is not widely requested by crawlers yet. It's worth implementing for future-proofing, but don't expect immediate impact.
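If you keep site metadata in code, you can generate the file instead of hand-editing it. The sketch below mirrors the layout of the example above; `build_llms_txt` and its parameters are hypothetical names chosen for illustration. Note that the proposal published at llmstxt.org describes a Markdown layout (an H1 title, a blockquote summary, and H2 sections of links), so check the current spec before you publish.

```python
def build_llms_txt(site, description, contact, key_pages, citation):
    """Build llms.txt text in the layout used by this article's example.

    key_pages is a list of (path, summary) tuples.
    """
    lines = [
        f"# Site: {site}",
        f"# Description: {description}",
        f"# Contact: {contact}",
        "",
        "## Key Content",
    ]
    lines += [f"- {path} - {summary}" for path, summary in key_pages]
    lines += ["", "## Citation Format", f'Please cite as: "{citation}"']
    return "\n".join(lines) + "\n"
```

Write the returned string to `/llms.txt` at your site root as part of your build or deploy step.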

Schema Markup for AI Citations

Schema markup (structured data) helps AI understand your content structure. It's not a ranking factor—it's a communication tool.

Key Schema Types for AEO

FAQPage

Marks up question-answer pairs. Helps AI extract specific answers.

HowTo

Marks up step-by-step instructions. Good for process content.

Article

Marks up articles with author, date, publisher. Establishes credibility.

Person

Marks up author information. Critical for E-E-A-T signals.

Myth busted: Many guides claim FAQ schema boosts AI citations. SE Ranking research found the opposite—pages WITHOUT FAQ schema received 4.2 average citations vs 3.6 WITH FAQ schema. The researchers note FAQ pages often appear on simpler support pages that naturally earn fewer citations.

Warning on Q&A formatting: Structuring content for answerability differs from artificially chunking content to game AI systems. Google has warned against FAQ-style fragmentation designed solely to capture AI citations: "We don't want you to do that." [3] Structure content for users first.

Schema helps AI extract and understand your content—but it's not a magic citation booster. Implement it for structure, not as a ranking hack.
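Here is a minimal sketch of what Article markup with a nested Person author can look like when generated as JSON-LD. The property names (`headline`, `datePublished`, `author`, `publisher`, `jobTitle`) come from schema.org; the function names and all values you pass in are placeholders for illustration.

```python
import json


def article_jsonld(headline, author_name, date_published, publisher_name,
                   author_job_title=None):
    """Return Article JSON-LD with a nested Person author as a JSON string."""
    author = {"@type": "Person", "name": author_name}
    if author_job_title:
        author["jobTitle"] = author_job_title
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,
        "author": author,
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD for embedding in the page's HTML."""
    # Escape "</" so a value like "</script>" cannot close the tag early.
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'
```

Emit the tag in the server-rendered HTML. If the markup is injected client-side, crawlers that skip JavaScript never see it.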

Common Crawl Authority: The Hidden Ranking Factor

Many LLMs are trained in part on Common Crawl data. Domains with higher Common Crawl authority metrics appear more frequently in training sets, making them more "familiar" to AI systems.

Two metrics correlate with AI citation likelihood:

PageRank

Standard link-based authority measure

Harmonic Centrality (HC)

How connected a domain is within the web graph

Higher Harmonic Centrality means a domain gets crawled more frequently, appears more often in training data, and becomes more recognizable to LLMs. An analysis of 607 million domains over 7 months found that domains ranked outside the top 1 million, in Common Crawl's "long tail," face an invisible authority ceiling regardless of content quality.
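To make the metric concrete: the harmonic centrality of a node is the sum of 1/d over every other node that can reach it, where d is the shortest-path distance. The toy sketch below computes it with a breadth-first search over incoming links. It assumes distances are measured toward the target domain and uses a tiny hypothetical graph; it illustrates the definition, not Common Crawl's actual pipeline.

```python
from collections import deque


def harmonic_centrality(links: dict[str, list[str]]) -> dict[str, float]:
    """links maps each domain to the domains it links to (directed edges)."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)

    # Reverse the graph so a BFS from x follows the links pointing into x.
    incoming = {n: [] for n in nodes}
    for source, targets in links.items():
        for target in targets:
            incoming[target].append(source)

    scores = {}
    for node in nodes:
        dist = {node: 0}
        queue = deque([node])
        while queue:
            current = queue.popleft()
            for prev in incoming[current]:
                if prev not in dist:
                    dist[prev] = dist[current] + 1
                    queue.append(prev)
        # Unreachable domains contribute nothing to the sum.
        scores[node] = sum(1 / d for n, d in dist.items() if n != node)
    return scores
```

A domain linked directly by many others, which are themselves well linked, accumulates a high score. A domain nobody links to scores zero no matter how good its content is.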

Check Your Position

Free tool to check Common Crawl rank for 18.2 million domains: metehan.ai/blog/cc-rank/

Sources: Cyrus Shepard, Founder of Zyppy; Metehan Yeşilyurt, SEO Consultant

Content Architecture for AI

Technical accessibility gets AI crawlers to your content. Content architecture determines whether that content gets cited. These strategic decisions sit at the intersection of technical and content—critical for AEO success.

ICP Landing Pages: Signaling Relevance to LLMs

LLMs struggle to recommend products when they cannot determine fit. A generic homepage describes what you do. It does not clarify who you serve.

ICP mapping creates dedicated landing pages for every Ideal Customer Profile intersection:

Industry pages

"[Product] for Healthcare," "[Product] for SaaS"

Solution pages

"[Product] for Inventory Management"

Size pages

"[Product] for Startups," "[Product] for Enterprise"

These pages serve two functions. They create internal linking hubs that clarify entity relevance to crawlers. They also match the specificity of user queries—"best CRM for real estate agencies" rather than "best CRM."

Source: Ross Hudgens, Founder of Siege Media

Marketing Pages vs. Knowledge Pages: Different Rules

Not all pages serve the same purpose for AI systems. Applying uniform optimization creates mismatched content.

Marketing Pages

Homepage, pricing, product pages

Approach: Traditional SEO. Optimize for conversion, brand messaging, user experience. These pages exist to close—not to be extracted.

Knowledge Pages

Blog posts, documentation, guides, FAQs

Approach: Optimize for answerability and AI reuse. Clear explanations. Factual statements. Citable data points.

LLMs are not looking for sales copy. They are looking for clear explanations they can extract and present to users. Apply AEO techniques to knowledge content. Leave marketing pages optimized for humans.

Source: Jessica Hennessey, Director of Organic Growth

Site Architecture for Query Fan-Out

When users ask AI a complex question, the AI often breaks it into sub-queries (query fan-out). Your site architecture should support this.

Hub-and-Spoke Content Model

Structure your content with pillar pages (hubs) that link to detailed articles (spokes). This mirrors how AI breaks down queries:

Example: "What is AEO?"

  • → Sub-query: "AEO vs SEO differences"
  • → Sub-query: "How to implement AEO"
  • → Sub-query: "AEO ranking factors"
  • → Sub-query: "AEO measurement tools"

If you have a page for each sub-query, you're more likely to be cited.
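One way to audit this is to list the sub-queries you expect and map each one to the page that answers it. The sketch below is a hypothetical helper, with made-up paths in the usage note, that flags sub-queries with no page and spoke pages the hub does not link to.

```python
from typing import Optional


def audit_hub(hub_links: set[str],
              spokes: dict[str, Optional[str]]) -> dict[str, list[str]]:
    """Compare expected sub-queries against the pages a hub actually links to.

    spokes maps each sub-query to the path that answers it, or None if no
    such page exists yet.
    """
    missing_pages = [query for query, path in spokes.items() if path is None]
    unlinked_spokes = [
        path for path in spokes.values()
        if path is not None and path not in hub_links
    ]
    return {"missing_pages": missing_pages, "unlinked_spokes": unlinked_spokes}
```

For example, calling it with hub links `{"/aeo/aeo-vs-seo"}` and spokes `{"AEO vs SEO differences": "/aeo/aeo-vs-seo", "AEO measurement tools": None}` reports one missing page and no unlinked spokes.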

For the full comparison of AEO vs SEO approaches, see our AEO vs SEO guide.

Technical AEO Checklist

  1. SSR Check: View page source. Can you see your content in raw HTML?
  2. Robots.txt: Ensure AI crawlers (GPTBot, ClaudeBot, PerplexityBot) aren't blocked
  3. Schema Markup: Implement Article, FAQPage, and Person schemas where relevant
  4. Page Speed: First Contentful Paint (FCP) under 0.4s correlates with higher citations
  5. Content Structure: Clear headings, Q&A format, direct answers
  6. llms.txt: Optional but worth implementing for future-proofing
  7. Common Crawl Rank: Check your position at metehan.ai/blog/cc-rank/
  8. ICP Landing Pages: Create specific pages for each segment you serve
  9. Page Classification: Identify Marketing vs Knowledge pages, optimize accordingly
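The robots.txt item is easy to automate. This sketch uses Python's standard `urllib.robotparser` to report which AI crawlers a given robots.txt blocks. The user-agent tokens are the crawler names used in this article; confirm the exact tokens in each vendor's documentation, since that mapping is an assumption here.

```python
from urllib.robotparser import RobotFileParser

# Crawler names as used in this article; verify tokens with each vendor.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt: str, url: str = "/") -> list[str]:
    """Return the AI crawlers that robots_txt disallows from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]
```

Pass in the contents of your live robots.txt and a path that matters, such as a key knowledge page. An empty list means none of the listed crawlers are blocked for that path.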

FAQ

AI crawlers like GPTBot, ClaudeBot, and PerplexityBot don't execute JavaScript. If your site uses client-side rendering (React, Vue, Angular without SSR), your content is invisible to AI assistants. The crawler sees an empty page instead of your content.

llms.txt is an emerging specification (like robots.txt but for AI) that provides structured guidance to AI crawlers about your site content. It's still early—not widely requested by crawlers yet—but worth implementing for future-proofing.

Schema markup helps AI understand and extract your content structure, but it's not a silver bullet. SE Ranking research found pages WITHOUT FAQ schema actually received slightly more citations (4.2 vs 3.6). Schema is helpful for structure, not a ranking factor.

Sources

  1. Cem Ozcelik — Crawls vs citations analysis
  2. Matt Hammel, VP of Marketing at Profound — 85/15 on-site vs off-site distribution
  3. Oren Greenberg, Founder of Kurve — Google Q&A formatting warning
