Technical AEO Foundations: What Most Teams Skip
Your content team optimizes for AI citations. But AI crawlers can't even read your site. Here's the technical foundation you need to fix first.
The Technical Gap Most Teams Skip
Most AEO guides jump straight to content optimization: add FAQ schema, structure your answers, include expert attribution. All good advice—if AI can actually read your content.
Here's the problem: AI crawlers don't execute JavaScript. If your site renders content client-side (React, Vue, Angular without SSR), AI assistants see an empty page. Your carefully optimized content is invisible.
The disconnect: CXL research found 73% of page-one results use schema markup, but 88% of sites still don't implement it properly. Content optimization without technical foundations is wasted effort.
Scope note: Technical foundations ensure AI can access your content, which is the prerequisite for any AEO strategy. However, research suggests website content accounts for only ~15% of AI mentions; the remaining 85% comes from off-site sources [2]. This article focuses on the on-site technical requirements. For off-site strategy, see Measuring AEO.
Why AI Crawlers Are Different from Google
Googlebot has evolved to execute JavaScript and render pages like a browser. AI crawlers haven't. GPTBot, ClaudeBot, PerplexityBot, and others operate more like early search crawlers—they fetch HTML and parse what's there.
| Crawler | JavaScript Execution | What It Sees |
|---|---|---|
| Googlebot | Yes (full rendering) | Complete rendered page |
| GPTBot (OpenAI) | No | Raw HTML only |
| ClaudeBot (Anthropic) | No | Raw HTML only |
| PerplexityBot | No | Raw HTML only |
The scale is significant. Vercel reported roughly 569 million GPTBot requests across its network in a single month. If your content requires JavaScript to render, none of that crawl activity can see it.
Important caveat: Crawl access does not guarantee citations. GPTBot can crawl your site thousands of times without ChatGPT ever mentioning you. Technical accessibility is necessary but not sufficient: it ensures AI can read your content, not that AI will cite it [1].
Quick test: View your page source (Ctrl+U / Cmd+U). If you see your content in the HTML, AI can read it. If you see mostly JavaScript bundles and empty divs, AI sees nothing.
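The same check can be scripted. The sketch below is a minimal approximation of what a crawler that does not run JavaScript would read: it parses raw HTML, ignores `<script>` and `<style>` contents, and reports which key phrases are missing. The function names (`visible_text`, `missing_phrases`) and both sample pages are made up for illustration, and real crawlers may extract text differently.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(raw_html):
    """Return the text a non-JavaScript fetcher could read from raw HTML."""
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


def missing_phrases(raw_html, phrases):
    """Return the phrases that do not appear in the raw HTML's visible text."""
    text = visible_text(raw_html).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]


# Hypothetical sample pages: a client-side shell and a server-rendered page.
CSR_SHELL = (
    '<html><head><title>Acme</title><script src="/bundle.js"></script></head>'
    '<body><div id="root"></div></body></html>'
)
SSR_PAGE = (
    "<html><head><title>Acme</title></head>"
    "<body><h1>Pricing guide</h1><p>Plans start at $10 per month.</p></body></html>"
)

print(missing_phrases(CSR_SHELL, ["Plans start at $10"]))
print(missing_phrases(SSR_PAGE, ["Plans start at $10"]))
```

To run it against a live page, pass in the response body from any plain HTTP client (for example `urllib.request.urlopen(url).read().decode()`), and pick phrases that only appear in your main content, not in the page title.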
Server-Side Rendering: Non-Negotiable
Server-side rendering (SSR) generates HTML on the server before sending it to the browser. The content exists in the initial HTML response—no JavaScript execution required.
Client-Side vs Server-Side Rendering
- Client-Side (CSR): Server sends empty HTML + JavaScript. Browser executes JS to render content. AI crawlers see: Empty page
- Server-Side (SSR): Server generates complete HTML with content. Browser displays immediately. AI crawlers see: Full content
Solutions by Framework
| Framework | SSR Solution |
|---|---|
| React | Next.js (SSR/SSG), Remix, Gatsby |
| Vue | Nuxt.js |
| Angular | Angular Universal (now `@angular/ssr`) |
| Any SPA | Pre-rendering services (Prerender.io, Rendertron) |
| Static sites | Already SSR-friendly (HTML, Hugo, Jekyll, Astro) |
If you're on WordPress, Webflow, or static HTML—you're fine. If you're on a JavaScript SPA without SSR, this is your #1 priority before any content optimization.
llms.txt: The New robots.txt
llms.txt is an emerging specification that provides structured guidance to AI crawlers—similar to how robots.txt guides search engine crawlers.
What llms.txt Contains
- Site description: What your site is about
- Content structure: How your content is organized
- Key pages: Most important content for AI to index
- Contact/attribution: How to cite your content
    # Example llms.txt structure
    # Site: Novastacks AI
    # Description: AEO and growth marketing consultancy
    # Contact: hello@novastacks-ai.com

    ## Key Content
    - /aeo - Main AEO services page
    - /aeo/aeo-vs-seo - AEO vs SEO comparison
    - /blog - Latest insights on AI marketing

    ## Citation Format
    Please cite as: "Novastacks AI (novastacks-ai.com)"
Reality check: Current research shows llms.txt is not widely requested by crawlers yet. It's worth implementing for future-proofing, but don't expect immediate impact.
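If you do add an llms.txt, generating it from a page inventory keeps it in sync with the site. The sketch below assumes the markdown layout described in the llmstxt.org proposal (an H1 title, a blockquote summary, then H2 sections containing link lists), which is slightly more structured than the simplified example above. The helper name `build_llms_txt`, the base URL, and the link labels are assumptions for illustration; the page paths and descriptions come from the example.

```python
def build_llms_txt(title, summary, sections):
    """Build llms.txt content.

    sections maps a heading to a list of (label, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for label, url, note in links:
            lines.append(f"- [{label}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


BASE_URL = "https://novastacks-ai.com"  # assumed from the contact address

llms_txt = build_llms_txt(
    title="Novastacks AI",
    summary="AEO and growth marketing consultancy",
    sections={
        "Key Content": [
            ("AEO services", f"{BASE_URL}/aeo", "Main AEO services page"),
            ("AEO vs SEO", f"{BASE_URL}/aeo/aeo-vs-seo", "AEO vs SEO comparison"),
            ("Blog", f"{BASE_URL}/blog", "Latest insights on AI marketing"),
        ],
    },
)
print(llms_txt)
```

The output would be served as a static file at the site root, alongside robots.txt.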
Schema Markup for AI Citations
Schema markup (structured data) helps AI understand your content structure. It's not a ranking factor—it's a communication tool.
Key Schema Types for AEO
- FAQPage: Marks up question-answer pairs. Helps AI extract specific answers.
- HowTo: Marks up step-by-step instructions. Good for process content.
- Article: Marks up articles with author, date, publisher. Establishes credibility.
- Person: Marks up author information. Critical for E-E-A-T signals.
Myth busted: Many guides claim FAQ schema boosts AI citations. SE Ranking research found the opposite—pages WITHOUT FAQ schema received 4.2 average citations vs 3.6 WITH FAQ schema. The researchers note FAQ pages often appear on simpler support pages that naturally earn fewer citations.
Warning on Q&A formatting: Structuring content for answerability differs from artificially chunking content to game AI systems. Google has warned against FAQ-style fragmentation designed solely to capture AI citations: "We don't want you to do that." [3] Structure content for users first.
Schema helps AI extract and understand your content—but it's not a magic citation booster. Implement it for structure, not as a ranking hack.
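For reference, here is a minimal sketch of what Article and FAQPage markup can look like when generated as JSON-LD. The property names (`headline`, `datePublished`, `author`, `publisher`, `mainEntity`, `acceptedAnswer`) are standard schema.org vocabulary; the helper names, the author "Jane Doe", and the publication date are placeholders, not details from this article.

```python
import json


def article_jsonld(headline, author_name, date_published, publisher_name):
    """Article markup with a Person author and an Organization publisher."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,
        "author": {"@type": "Person", "name": author_name},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }


def faq_jsonld(qa_pairs):
    """FAQPage markup built from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def as_script_tag(data):
    """Wrap JSON-LD in a script tag, escaping '</' so it cannot close the tag."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


article = article_jsonld(
    headline="Technical AEO Foundations: What Most Teams Skip",
    author_name="Jane Doe",         # placeholder author
    date_published="2025-01-15",    # placeholder date
    publisher_name="Novastacks AI",
)
faq = faq_jsonld([("What is llms.txt?", "An emerging specification for guiding AI crawlers.")])
print(as_script_tag(article))
print(as_script_tag(faq))
```

Because AI crawlers read raw HTML, the script tag needs to be part of the server-rendered response, not injected by client-side JavaScript.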
Common Crawl Authority: The Hidden Ranking Factor
Many LLMs are trained in part on Common Crawl data. Domains with higher Common Crawl authority metrics appear more frequently in training sets, making them more "familiar" to AI systems.
Two metrics correlate with AI citation likelihood:
- PageRank: Standard link-based authority measure
- Harmonic Centrality (HC): How connected a domain is within the web graph
Higher Harmonic Centrality means a domain gets crawled more frequently, appears more often in training data, and becomes more recognizable to LLMs. An analysis of 607 million domains over 7 months found that domains ranked outside the top 1 million, in Common Crawl's "long tail," face an invisible authority ceiling regardless of content quality.
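To make the metric less abstract: harmonic centrality for a node is the sum of 1/d over every other node that can reach it, where d is the shortest link distance. Close, well-linked neighbors add a lot; distant or unreachable ones add little or nothing. The sketch below computes it on a toy link graph with breadth-first search. It illustrates the textbook definition only, not Common Crawl's actual pipeline, and the domain names are invented.

```python
from collections import deque


def harmonic_centrality(edges):
    """Return {node: sum of 1/distance from every node that can reach it}.

    edges is an iterable of (source, target) links.
    """
    nodes = set()
    incoming = {}
    for source, target in edges:
        nodes.update((source, target))
        incoming.setdefault(target, set()).add(source)

    scores = {}
    for node in nodes:
        # Breadth-first search over reversed links gives the distance
        # from each reachable predecessor to this node.
        distance = {node: 0}
        queue = deque([node])
        total = 0.0
        while queue:
            current = queue.popleft()
            for predecessor in incoming.get(current, ()):
                if predecessor not in distance:
                    distance[predecessor] = distance[current] + 1
                    total += 1.0 / distance[predecessor]
                    queue.append(predecessor)
        scores[node] = total
    return scores


# Toy graph: three sites link to a hub, and the hub links to one niche site.
links = [
    ("a.example", "hub.example"),
    ("b.example", "hub.example"),
    ("c.example", "hub.example"),
    ("hub.example", "niche.example"),
]
scores = harmonic_centrality(links)
for domain, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{domain}: {score:.2f}")
```

In this toy graph the hub scores 3.0 (three direct inbound links), while the niche site scores 2.5: one direct link from the hub plus half credit for each site two hops away. A single link from a well-connected domain carries that domain's connectivity with it.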
Check Your Position
Free tool to check Common Crawl rank for 18.2 million domains: metehan.ai/blog/cc-rank/
Sources: Cyrus Shepard, Founder of Zyppy; Metehan Yeşilyurt, SEO Consultant
Content Architecture for AI
Technical accessibility gets AI crawlers to your content. Content architecture determines whether that content gets cited. These strategic decisions sit at the intersection of technical and content—critical for AEO success.
ICP Landing Pages: Signaling Relevance to LLMs
LLMs struggle to recommend products when they cannot determine fit. A generic homepage describes what you do. It does not clarify who you serve.
ICP mapping creates dedicated landing pages for every Ideal Customer Profile intersection:
- Industry pages: "[Product] for Healthcare," "[Product] for SaaS"
- Solution pages: "[Product] for Inventory Management"
- Size pages: "[Product] for Startups," "[Product] for Enterprise"
These pages serve two functions. They create internal linking hubs that clarify entity relevance to crawlers. They also match the specificity of user queries—"best CRM for real estate agencies" rather than "best CRM."
Source: Ross Hudgens, Founder of Siege Media
Marketing Pages vs. Knowledge Pages: Different Rules
Not all pages serve the same purpose for AI systems. Applying uniform optimization creates mismatched content.
- Marketing pages (homepage, pricing, product pages): Use traditional SEO. Optimize for conversion, brand messaging, and user experience. These pages exist to close, not to be extracted.
- Knowledge pages (blog posts, documentation, guides, FAQs): Optimize for answerability and AI reuse. Clear explanations. Factual statements. Citable data points.
LLMs are not looking for sales copy. They are looking for clear explanations they can extract and present to users. Apply AEO techniques to knowledge content. Leave marketing pages optimized for humans.
Source: Jessica Hennessey, Director of Organic Growth
Site Architecture for Query Fan-Out
When users ask AI a complex question, the AI often breaks it into sub-queries (query fan-out). Your site architecture should support this.
Hub-and-Spoke Content Model
Structure your content with pillar pages (hubs) that link to detailed articles (spokes). This mirrors how AI breaks down queries:
Example: "What is AEO?"
- → Sub-query: "AEO vs SEO differences"
- → Sub-query: "How to implement AEO"
- → Sub-query: "AEO ranking factors"
- → Sub-query: "AEO measurement tools"
If you have a page for each sub-query, you're more likely to be cited.
For the full comparison of AEO vs SEO approaches, see our AEO vs SEO guide.
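A hub-and-spoke structure only helps if the links actually exist in both directions. As a rough sketch, the audit below takes a map of internal links (page path to the set of paths it links to) and reports spokes the hub does not link to and spokes that do not link back. The function name `audit_hub` and most of the paths are hypothetical; only `/aeo` and `/aeo/aeo-vs-seo` appear elsewhere in this article.

```python
def audit_hub(hub, spokes, links):
    """Check hub-to-spoke and spoke-to-hub links.

    links maps a page path to the set of internal paths it links to.
    """
    hub_links = links.get(hub, set())
    return {
        "hub_missing_links_to": [s for s in spokes if s not in hub_links],
        "spokes_missing_link_back": [s for s in spokes if hub not in links.get(s, set())],
    }


# Hypothetical internal link map, e.g. exported from a site crawl.
site_links = {
    "/aeo": {"/aeo/aeo-vs-seo", "/aeo/implementation", "/aeo/ranking-factors"},
    "/aeo/aeo-vs-seo": {"/aeo"},
    "/aeo/implementation": {"/aeo", "/aeo/ranking-factors"},
    "/aeo/ranking-factors": set(),
    "/aeo/measurement-tools": {"/aeo"},
}
spoke_pages = [
    "/aeo/aeo-vs-seo",
    "/aeo/implementation",
    "/aeo/ranking-factors",
    "/aeo/measurement-tools",
]

report = audit_hub("/aeo", spoke_pages, site_links)
print(report)
```

Each spoke here corresponds to one of the sub-queries in the example above, so a gap in the report is a sub-query your hub is not pointing crawlers toward.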
Technical AEO Checklist
1. SSR Check: View page source. Can you see your content in raw HTML?
2. Robots.txt: Ensure AI crawlers (GPTBot, ClaudeBot) aren't blocked
3. Schema Markup: Implement Article, FAQ, Person schemas where relevant
4. Page Speed: FCP under 0.4s correlates with higher citations
5. Content Structure: Clear headings, Q&A format, direct answers
6. llms.txt: Optional but worth implementing for future-proofing
7. Common Crawl Rank: Check your position at metehan.ai/blog/cc-rank/
8. ICP Landing Pages: Create specific pages for each segment you serve
9. Page Classification: Identify Marketing vs Knowledge pages, optimize accordingly
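The robots.txt item on the checklist is easy to automate. The sketch below uses Python's standard-library `urllib.robotparser` to test whether a given URL is blocked for each AI crawler named in this article. The helper name `blocked_ai_crawlers` and the sample robots.txt are illustrative, and the standard-library parser implements only the basic rules, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt, url, crawlers=AI_CRAWLERS):
    """Return the crawler names that robots_txt disallows from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in crawlers if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that blocks GPTBot everywhere
# and blocks everyone else from /admin/ only.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_ai_crawlers(SAMPLE_ROBOTS, "https://example.com/blog/post"))
print(blocked_ai_crawlers(SAMPLE_ROBOTS, "https://example.com/admin/settings"))
```

To check a live site, download its robots.txt with any HTTP client and pass the text in, then test a few representative URLs, such as your main knowledge pages.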
FAQ
AI crawlers like GPTBot, ClaudeBot, and PerplexityBot don't execute JavaScript. If your site uses client-side rendering (React, Vue, Angular without SSR), your content is invisible to AI assistants. The crawler sees an empty page instead of your content.
llms.txt is an emerging specification (like robots.txt but for AI) that provides structured guidance to AI crawlers about your site content. It's still early—not widely requested by crawlers yet—but worth implementing for future-proofing.
Schema markup helps AI understand and extract your content structure, but it's not a silver bullet. SE Ranking research found pages WITHOUT FAQ schema actually received slightly more citations (4.2 vs 3.6). Schema is helpful for structure, not a ranking factor.
Sources
1. Cem Ozcelik: Crawls vs citations analysis
2. Matt Hammel, VP of Marketing at Profound: 85/15 on-site vs off-site distribution
3. Oren Greenberg, Founder of Kurve: Google Q&A formatting warning
Get a Technical AEO Audit
Find out if AI crawlers can actually read your content. We'll assess your technical foundation and identify quick wins.