AI Engines Domain Diversity Compared: Claude vs ChatGPT vs Perplexity vs Gemini (Univ. Toronto Findings)

The University of Toronto study by Chen et al. (arXiv:2509.08919, 2025) ran the same prompts through Claude, ChatGPT, Perplexity, and Gemini and quantified domain diversity, set overlap, and unique-domain shares. The headline finding: pairwise Jaccard overlap between engines falls roughly in the 0.10 to 0.25 range, so each AI engine samples a substantially different evidence pool. Below: the exact diversity metrics, what they imply for cross-engine GEO strategy, and how to optimize for multi-engine visibility instead of betting on a single platform.

TL;DR: In the Toronto automotive vertical, Claude returned 350 distinct domains, ChatGPT 212, Perplexity 347. Pairwise Jaccard: Claude-ChatGPT 0.147, Claude-Perplexity 0.251, ChatGPT-Perplexity 0.096. Unique-domain shares: Claude 50.3%, ChatGPT 60.8%, Perplexity 56.5%. Local services were even more fragmented. The strategic implication: never optimize for just one engine; the multi-engine surface area is the actual GEO playing field.

The diversity numbers by vertical and engine

Vertical            | Engine     | Distinct domains | Unique-domain share | Top pairwise Jaccard
--------------------|------------|------------------|---------------------|------------------------
Automotive          | Claude     | 350              | 50.3%               | 0.251 (vs. Perplexity)
Automotive          | ChatGPT    | 212              | 60.8%               | 0.147 (vs. Claude)
Automotive          | Perplexity | 347              | 56.5%               | 0.251 (vs. Claude)
Auto Repair (local) | Claude     | 51               | 35%                 | ~0.20
Auto Repair (local) | ChatGPT    | 53               | 62%                 | ~0.15
Auto Repair (local) | Gemini     | 98               | 55%                 | ~0.25 (vs. Perplexity)
Auto Repair (local) | Perplexity | 117              | 63%                 | ~0.25 (vs. Gemini)
Dentists (local)    | Gemini     | 151              | 54%                 | 0.23 (vs. Perplexity)
Dentists (local)    | Perplexity | 138              | 54%                 | 0.23 (vs. Gemini)
Dentists (local)    | Claude     | 64               | 40%                 | ~0.15
Dentists (local)    | ChatGPT    | 51               | 68%                 | ~0.10

Source: Chen et al., 2025, §5.2.4-5.2.5.
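
For readers who want to reproduce these two metrics on their own citation logs, here is a minimal Python sketch. It assumes the standard set definition of Jaccard similarity and assumes that "unique-domain share" means the fraction of an engine's domains that no other engine cited; the study's exact definitions may differ. The domain names are toy data invented for illustration.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unique_share(engine: str, domains: dict[str, set[str]]) -> float:
    """Fraction of an engine's domains that no other engine cited (assumed definition)."""
    own = domains[engine]
    others = set().union(*(d for name, d in domains.items() if name != engine))
    return len(own - others) / len(own) if own else 0.0


# Toy data, invented for illustration (not taken from the study).
cited = {
    "claude": {"a.com", "b.com", "c.com", "d.com"},
    "chatgpt": {"a.com", "e.com", "f.com"},
    "perplexity": {"a.com", "b.com", "g.com", "h.com"},
}

# Jaccard for every engine pair, and unique share for every engine.
pairwise = {
    (x, y): round(jaccard(cited[x], cited[y]), 3)
    for x, y in combinations(cited, 2)
}
shares = {name: round(unique_share(name, cited), 3) for name in cited}

print(pairwise)
print(shares)
```

With real data, each set would hold the distinct domains one engine cited across the whole prompt panel for a vertical.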

The 7-step multi-engine GEO playbook

  1. Step 1: Build an engine-specific source map. For your top 30 prompts, run them through all four engines and log the cited domains. Cluster the result into: (a) cross-engine universal sources (cited by all 4), (b) engine-pair sources (cited by 2-3 engines), (c) engine-unique sources. The universal cluster is small and high-leverage; the engine-unique cluster is large and requires engine-specific tactics.
  2. Step 2: Prioritize cross-engine universal domains. A handful of domains (Wikipedia, top review aggregators, Reuters, Forbes) appear across all four engines. Earning placements in these domains generates compounding citations across the entire AI Search surface.
  3. Step 3: Build engine-pair coverage for the next tier. The Toronto data shows Claude-Perplexity overlap is the highest pair (0.251 in automotive). Sources that appear in both (Car and Driver, Consumer Reports for autos) are second-tier high-leverage targets.
  4. Step 4: For Gemini and Perplexity, invest in domain breadth. Both engines return 100-150+ distinct domains in local categories. Coverage breadth (more vertical aggregators, more directory listings) matters more than depth.
  5. Step 5: For Claude and ChatGPT, invest in domain authority depth. Both engines return narrower sets (roughly 50-65 distinct domains in local categories). A few authority domains drive most citations. Optimize for those.
  6. Step 6: Track ChatGPT’s high unique-share carefully. ChatGPT’s unique-domain share is 60-68% in the Toronto data, meaning most of the domains ChatGPT cites are cited by no other engine. A ChatGPT-specific PR strategy is therefore required; you cannot rely on Claude or Perplexity coverage spilling over.
  7. Step 7: Report diversity metrics quarterly. Track: (a) % of prompts where you appear in 4 engines, 3 engines, 2 engines, 1 engine, 0 engines; (b) engine-specific Jaccard between your citation set and competitor sets; (c) trend in cross-engine universal-domain coverage. These three metrics quantify your multi-engine moat.
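
The source map from Step 1 can be sketched in a few lines of Python. This is an illustrative sketch, not the study's method: the function name `cluster_sources`, the bucket labels, and the sample domains are assumptions, and the input is whatever per-engine set of cited domains you logged from your prompt runs.

```python
from collections import defaultdict


def cluster_sources(citations: dict[str, set[str]]) -> dict[str, set[str]]:
    """Bucket domains by how many engines cited them.

    universal: cited by every engine in the input
    shared:    cited by at least two engines, but not all
    unique:    cited by exactly one engine
    """
    engines_per_domain: dict[str, set[str]] = defaultdict(set)
    for engine, domains in citations.items():
        for domain in domains:
            engines_per_domain[domain].add(engine)

    n_engines = len(citations)
    clusters: dict[str, set[str]] = {"universal": set(), "shared": set(), "unique": set()}
    for domain, engines in engines_per_domain.items():
        if len(engines) == n_engines:
            clusters["universal"].add(domain)
        elif len(engines) >= 2:
            clusters["shared"].add(domain)
        else:
            clusters["unique"].add(domain)
    return clusters


# Hypothetical log of cited domains per engine.
citations = {
    "claude": {"wikipedia.org", "caranddriver.com", "x.example"},
    "chatgpt": {"wikipedia.org", "y.example"},
    "perplexity": {"wikipedia.org", "caranddriver.com"},
    "gemini": {"wikipedia.org", "z.example"},
}

clusters = cluster_sources(citations)
for label, domains in clusters.items():
    print(label, sorted(domains))
```

Keeping the engine set per domain, rather than only a count, makes it easy to extend the sketch to report which engine pair shares each second-tier source.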

What the diversity gap means strategically

Most GEO advice assumes that what works on one engine works on all. The Toronto data shows this is empirically false. A ChatGPT-Perplexity Jaccard of 0.096 in automotive means that fewer than 10% of the domains cited by either engine are cited by both; more than 90% appear in only one of the two. A brand optimized for ChatGPT through Wikipedia, AP, and Reuters placements will appear in ChatGPT outputs but barely register on Perplexity, which prioritizes YouTube and Car and Driver.
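
To make the 0.096 figure concrete, the Jaccard definition J = I / (A + B - I) can be solved for the number of shared domains I. The sketch below assumes that the distinct-domain counts reported for automotive (ChatGPT 212, Perplexity 347) are the same sets the Jaccard value was computed on, which the article does not confirm; treat the result as a back-of-the-envelope estimate.

```python
def shared_count(size_a: int, size_b: int, jaccard: float) -> float:
    """Solve J = I / (A + B - I) for the intersection size I."""
    return jaccard * (size_a + size_b) / (1 + jaccard)


chatgpt, perplexity, j = 212, 347, 0.096

shared = shared_count(chatgpt, perplexity, j)
union = chatgpt + perplexity - shared

print(f"shared domains (estimate): {shared:.1f}")
print(f"combined domain pool:      {union:.1f}")
print(f"share of ChatGPT's set:    {shared / chatgpt:.1%}")
print(f"share of Perplexity's set: {shared / perplexity:.1%}")
```

Under that assumption, the two engines share about 49 domains out of a combined pool of about 510, which is roughly 23% of ChatGPT's set and 14% of Perplexity's.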

The strategic implication is a portfolio approach. Allocate PR and content budget across the engine-specific source maps in proportion to: (a) your target buyer’s engine usage by vertical, (b) the influence weight of each source on its native engine, (c) the marginal cost of earning a placement. CapstonAI partner cohort data from Q1 2026 shows that brands tracking and optimizing for all 4 engines see 2.2x higher total citation share than brands optimizing for ChatGPT alone.
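
One possible way to turn the three factors into a budget split is sketched below. The article does not say how to combine them, so the scoring rule (usage times influence, divided by marginal cost, then normalized) is an assumption, as are the function name `allocate` and every number in the example.

```python
def allocate(budget: float, engines: dict[str, dict[str, float]]) -> dict[str, float]:
    """Split a budget in proportion to usage * influence / cost (assumed scoring rule).

    Each engine entry needs "usage", "influence", and a non-zero "cost".
    """
    scores = {
        name: f["usage"] * f["influence"] / f["cost"]
        for name, f in engines.items()
    }
    total = sum(scores.values())
    return {name: budget * score / total for name, score in scores.items()}


# Hypothetical inputs: buyer usage share, source influence weight (0-1),
# and relative marginal cost of earning one placement.
engines = {
    "chatgpt": {"usage": 0.5, "influence": 0.6, "cost": 3.0},
    "perplexity": {"usage": 0.3, "influence": 0.8, "cost": 2.0},
    "claude": {"usage": 0.1, "influence": 0.5, "cost": 1.0},
    "gemini": {"usage": 0.1, "influence": 0.9, "cost": 3.0},
}

plan = allocate(30_000, engines)
for name, amount in plan.items():
    print(f"{name:<11} {amount:>9.0f}")
```

A multiplicative score like this rewards engines that are both heavily used and cheap to influence; an additive or capped rule would be an equally defensible choice if one engine should never fall below a minimum share.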

Common errors with multi-engine GEO

  • Treating “AI Search” as a monolith. Each engine has distinct sourcing logic, biases, and authority signals. Use engine-specific dashboards.
  • Optimizing only for ChatGPT. ChatGPT may have the largest user base, but Perplexity, Claude, and Gemini are growing fast and have different source preferences. Multi-engine coverage future-proofs your visibility.
  • Skipping local-specific tactics. Local categories have the most fragmented domain ecosystems. Vertical aggregators (HomeAdvisor, Yelp, Healthgrades) and directory sites disproportionately matter here.
  • Ignoring Gemini’s brand-leaning behavior. Gemini is the most brand-friendly engine in the Toronto data. If you have strong brand-owned content, Gemini may already be citing you; don’t deprioritize it.
  • Failing to track Jaccard drift. Engine source preferences shift quarterly. A source that drove 30% of Claude citations in Q1 may drop to 12% by Q3. Quarterly re-baselining is mandatory.
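
The quarterly re-baselining in the last point can be approximated with a simple share comparison between two snapshots. This is a minimal sketch: the 10-percentage-point threshold, the function names, and the sample citation logs are all assumptions chosen for illustration.

```python
from collections import Counter


def citation_shares(cited_domains: list[str]) -> dict[str, float]:
    """Share of all citations that each domain accounts for in one snapshot."""
    counts = Counter(cited_domains)
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()} if total else {}


def share_drift(before: list[str], after: list[str], threshold: float = 0.10) -> dict[str, float]:
    """Domains whose citation share moved by at least `threshold` between snapshots."""
    old, new = citation_shares(before), citation_shares(after)
    changes = {d: new.get(d, 0.0) - old.get(d, 0.0) for d in old.keys() | new.keys()}
    return {d: delta for d, delta in changes.items() if abs(delta) >= threshold}


# Hypothetical citation logs for one engine: one list entry per citation.
q1 = ["a.com"] * 3 + ["b.com"] * 5 + ["c.com"] * 2
q3 = ["a.com"] * 1 + ["b.com"] * 5 + ["c.com"] * 2 + ["d.com"] * 2

flagged = share_drift(q1, q3)
for domain, delta in sorted(flagged.items()):
    print(f"{domain}: {delta:+.0%}")
```

In this toy example a.com falls from 30% to 10% of citations and d.com appears at 20%, so both are flagged while the stable sources are not.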

FAQ — AI engine domain diversity

Which engine should we prioritize if we can only invest in one?

Depends on buyer behavior. B2B SaaS and enterprise buyers tilt toward ChatGPT and Perplexity. D2C and search-replacement queries tilt toward Google AI Overview/Gemini. Specialized research tilts toward Claude (where reasoning quality is highest). For most CapstonAI cohort customers, ChatGPT is the highest-volume engine but Perplexity has the highest engagement-to-conversion ratio.

How do we measure cross-engine coverage in practice?

Build a 30-50 prompt panel covering your top buyer questions. Run it across all 4 engines weekly. Compute citation rate per engine, cross-engine appearance rate, and Jaccard similarity to top competitors. The CapstonAI platform automates this in a Looker Studio dashboard with weekly email exports.
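
As a rough illustration of those calculations, the sketch below computes the per-engine citation rate and the share of prompts where a brand appears in 4, 3, 2, 1, or 0 engines. It is not a description of the CapstonAI platform; the data layout, the function names, and the domain brand.example are hypothetical.

```python
from collections import Counter

ENGINES = ("claude", "chatgpt", "perplexity", "gemini")

# Hypothetical layout: {prompt: {engine: set of domains cited for that prompt}}
Panel = dict[str, dict[str, set[str]]]


def citation_rate(panel: Panel, brand: str) -> dict[str, float]:
    """Per engine, the fraction of prompts where the brand's domain was cited."""
    total = len(panel)
    if not total:
        return {}
    return {
        engine: sum(brand in run.get(engine, set()) for run in panel.values()) / total
        for engine in ENGINES
    }


def coverage_distribution(panel: Panel, brand: str) -> dict[int, float]:
    """Fraction of prompts where the brand appears in exactly k engines."""
    total = len(panel)
    if not total:
        return {}
    counts = Counter(
        sum(brand in run.get(engine, set()) for engine in ENGINES)
        for run in panel.values()
    )
    return {k: counts[k] / total for k in range(len(ENGINES), -1, -1)}


panel: Panel = {
    "best family suv": {"claude": {"brand.example", "a.com"}, "chatgpt": {"a.com"},
                        "perplexity": {"brand.example"}, "gemini": {"brand.example"}},
    "suv reliability": {"claude": {"a.com"}, "chatgpt": {"b.com"},
                        "perplexity": {"brand.example"}, "gemini": set()},
    "cheapest hybrid": {"claude": set(), "chatgpt": set(),
                        "perplexity": {"c.com"}, "gemini": {"c.com"}},
    "ev range": {engine: {"brand.example"} for engine in ENGINES},
}

rates = citation_rate(panel, "brand.example")
coverage = coverage_distribution(panel, "brand.example")
print(rates)
print(coverage)
```

Running the same functions on a competitor's domain gives the comparison baseline, and the per-engine domain sets can feed a Jaccard calculation against that competitor's citation set.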

Does engine diversity matter more in some industries?

Yes. Local services have the most engine fragmentation (Jaccard often below 0.1) and require the most engine-specific strategy. Global enterprise SaaS has the highest convergence (Jaccard 0.3-0.5) because all engines cite the same handful of authority publications. Plan engine portfolio investment accordingly.

Last updated: May 2026. Primary source: Chen, M., Wang, X., Chen, K., & Koudas, N. (2025). Generative Engine Optimization: How to Dominate AI Search. University of Toronto. arXiv:2509.08919. https://arxiv.org/abs/2509.08919