Foundations · 28 min read

How to Get Cited by LLMs: The Complete Taxonomy of GEO Methods

Q: Does llms.txt actually do anything if no major vendor has confirmed they read it?

The cost of shipping a well-formed llms.txt is roughly ten minutes; the worst case is that no agent reads it. The best case is that you are correctly indexed by the growing long tail of RAG-as-a-service products that have adopted the spec. It is a positive expected-value bet at low cost. The actual evidence on adoption is still emerging; see our llms.txt audit for the 100-domain snapshot.

Every method GEO practitioners use to surface in ChatGPT, Claude, Perplexity, and AI Overviews — grouped into five families and rated by evidence quality. A synthesis of the published literature, vendor docs, and our own audits. The map of the discipline as of June 2026.

GeoSalience

Published 19 May 2026 · Updated 8 Jun 2026

The five families, the headline finding, and the one method everyone underweights

GEO methods cluster into five families: technical foundations, content design, structural metadata, brand and authority signals, and measurement.
The strongest published evidence (Aggarwal et al., 2024) shows that citing primary sources, adding statistics, and adding inline quotations improve a source's visibility in generated answers by up to ~40% on the paper's metrics. The paper measured this on its own GPT-3.5-based generative engine, with a follow-up check on Perplexity.ai, and found a larger gain than from the other rewrites it tested, such as keyword stuffing.
The single most underweighted method on practitioner blogs is answer-first chunking — writing the first 1–2 sentences of every H2 so that they can be extracted and quoted on their own, without context from the rest of the page.
Tactics that sound high-leverage but show weak public evidence as of May 2026: blanket llms.txt adoption with no link list, FAQ schema on every page, and "AI-friendly" copywriting tricks that do not change information density.
Treat this taxonomy as a working map, not a ranking. Run the 30-day implementation roadmap at the end of this guide and measure on your own data — every citation engine is a moving target.

The Generative Engine Optimization (GEO) literature is young. The first paper to use the term was published in late 2023; the first round of trade-press how-to articles arrived in 2024; the first batch of measurement tools (Profound, Peec, Otterly, Athena HQ) shipped between 2024 and 2026. In a discipline this new, the inventory of methods practitioners actually use is more useful than another ranked listicle of "top tactics".

This guide is that inventory, and the map for the Foundations pillar of this site. We have grouped every method into five families, attached the public evidence to each, flagged the antipatterns we see most often, and closed with an implementation order you can run in a month. We will revisit the entries as we publish our own primary research over the next two quarters — first published 2026-05-19, last reviewed 2026-06-08.

If you only have five minutes, read the callout above and the Evidence Matrix near the bottom. The map of the territory matters more than any single method.

The map: five families of GEO methods

Every method that earns a place in this guide is a deliberate change to one of five layers of a website. We group them this way because the layers correspond to different stages of how an LLM ingests, retrieves, and renders your content.

Family	What it changes	Who reads the signal	Time to effect
Technical foundations	The bytes your server returns	Crawlers, parsers, RAG indexers	Hours–days
Content design	The information density on each page	The generator at answer time	Days–weeks
Structural metadata	The relationships between pages	Crawlers, knowledge-graph builders	Days–weeks
Brand and authority signals	The web's opinion of your site	Pre-training filters, ranking models	Weeks–months
Measurement and iteration	Your ability to know what worked	You (and the next decision you make)	Continuous

The families are not a ladder you climb in order. They are five surfaces you should be working on in parallel, because they each feed a different decision the LLM is making about whether to cite you. The taxonomy that follows expands every family into the methods inside it — twenty-three in total at this revision.

Mental model: how LLMs choose which sources to cite

Before the methods, the mechanics. Different methods target different stages of the pipeline, and the only way to reason about leverage is to know which stage you are influencing. As of May 2026, a citation-capable LLM passes a query through three stages.

Stage 1 — Pre-training corpus inclusion

The base model was trained on a snapshot of the web. If your domain was in that snapshot, the model has some representation of it — terminology, common claims, brand entities. This stage is mostly out of your direct control on a short horizon; you cannot retroactively change what Common Crawl picked up. What you can do is increase the chance you will be in the next snapshot: clean HTML, indexable URLs, sufficient text-to-chrome ratio, and content the open web is willing to link to.

For most practitioners reading this guide, pre-training inclusion is a long-horizon investment. The next stage is where short-term wins live.

Stage 2 — Retrieval (RAG)

When a user asks a citation-capable LLM a question, the system runs a retrieval step — usually a hybrid of keyword and vector search against a live index — and selects a small number of source documents to ground the answer. This is the same pattern described by Lewis et al. (2020) in the original RAG paper, now productised across ChatGPT Browse, Perplexity, Claude with web search, Copilot, and AI Overviews.

Retrieval is the stage where most GEO leverage lives. The system needs to be able to find your page, parse it, decide it is relevant, and extract a useful chunk. Every method in the technical foundations, content design, and structural metadata families exists to make one of those four sub-steps work better for you.

Stage 3 — Grounded generation and citation

Once the model has retrieved a set of candidate sources, it generates an answer and decides which sources to attribute. The attribution decision is influenced by the model's training (what it was rewarded for in RLHF), the system prompt of the product surface (ChatGPT vs Perplexity vs AI Overviews behave differently), and the quality of the extractable chunks. A page that is high-relevance but unextractable — for example, key information locked inside an image — frequently fails to be cited even when it would have helped.

Brand and authority signals operate quietly across all three stages: they raise the prior probability the page is selected during retrieval, raise the model's confidence at generation time, and raise the chance the page ends up in the next training snapshot.

With the map and the mechanics in place, we can walk the methods.

Family 1 — Technical foundations

Technical methods change what your server returns when a crawler, parser, or RAG indexer fetches a URL. They are the cheapest family to implement (most are a few hours of work) and they are the easiest to verify (you can curl the result). They will not, on their own, make a weak page citable — but a strong page without them is leaving low-cost wins on the table.

Method 1 — Schema.org JSON-LD

Schema.org is a vocabulary maintained by Google, Microsoft, Yahoo, and Yandex for marking up the entities on a page. It is most often delivered as a JSON-LD <script> block, usually in the document <head>, and is read by the major search crawlers.

For an editorial site, the high-value types are Article (or NewsArticle, BlogPosting), Person for author bylines, Organization for the publisher, BreadcrumbList for site hierarchy, FAQPage where the article carries an actual FAQ block, and DefinedTerm for glossary entries. Aggarwal et al. (2024) did not isolate JSON-LD as a single test variable, so the direct citation-lift evidence is still mixed — but the indirect evidence is strong: pages with valid Article markup are more reliably parsed by RAG systems and are more frequently surfaced in AI Overviews, which inherits Google's existing dependence on structured data.

A minimal Article block, on a real published page:

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to Get Cited by LLMs",
  "datePublished": "2026-05-19",
  "dateModified": "2026-05-19",
  "author": {
    "@type": "Person",
    "name": "Jane Doe",
    "url": "https://example.com/authors/jane-doe"
  },
  "publisher": {
    "@type": "Organization",
    "name": "GeoSalience",
    "url": "https://example.com"
  },
  "mainEntityOfPage": "https://example.com/foundations/geo-methods-taxonomy"
}

Two practical rules: validate every page in validator.schema.org and Google's Rich Results Test before you call the work done, and never declare attributes that are not visibly true on the page — a FAQPage schema without a real FAQ block is treated as a quality signal against you. We walk the exact JSON-LD blocks we ship — Article, BreadcrumbList, Dataset, and FAQPage — in JSON-LD recipes for GEO.
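Before running the external validators, a local pre-check can catch blocks that are obviously incomplete. The sketch below is a minimal example, not a replacement for validator.schema.org: the field list in `REQUIRED_ARTICLE_FIELDS` is our own assumption modelled on the example block above, not an official requirement list, and `missing_article_fields` is a hypothetical helper name.

```python
import json

# Assumed minimum field set, modelled on the example block above.
# This is NOT an official Schema.org or Google requirement list.
REQUIRED_ARTICLE_FIELDS = ("headline", "datePublished", "dateModified", "author", "publisher")


def missing_article_fields(jsonld_text: str) -> list[str]:
    """Return the assumed-required fields that are absent or empty."""
    block = json.loads(jsonld_text)
    if block.get("@type") not in ("Article", "NewsArticle", "BlogPosting"):
        raise ValueError("not an Article-family block")
    return [field for field in REQUIRED_ARTICLE_FIELDS if not block.get(field)]


# A deliberately incomplete block: no dateModified, no publisher.
example = json.dumps({
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to Get Cited by LLMs",
    "datePublished": "2026-05-19",
    "author": {"@type": "Person", "name": "Jane Doe"},
})

print(missing_article_fields(example))  # ['dateModified', 'publisher']
```

A check like this only confirms that fields exist. Whether they are visibly true on the page still needs a human read.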

Method 2 — `llms.txt` and `llms-full.txt`

llms.txt is a proposal by Jeremy Howard (llmstxt.org, 2024) for a Markdown-formatted index file at the root of your domain. Its job is to hand-curate which URLs on your site matter for LLM consumers, with one-line descriptions, organised by section. The companion llms-full.txt concatenates the full text of the most important pages.

The honest evidence picture, as of May 2026: no major LLM vendor has confirmed that llms.txt is in their crawler pipeline. The file is read by an unknown number of agents, including the long tail of RAG-as-a-service products. The cost of shipping one is roughly ten minutes of work for a small site, and the downside risk is zero. The cost-benefit is positive even at low adoption.

The failure mode we see most often is shipping an llms.txt with an H1 and a few sections but no actual link list under ## Docs or ## Optional. That defeats the file's purpose. A useful llms.txt is the link list. We covered the spec, the adoption audit, and a working setup in llms.txt: spec, adoption, setup.
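The empty-link-list failure mode is easy to detect mechanically. The following is a rough sketch that counts Markdown link items under each `## ` section of an llms.txt file. The regular expression and the `links_per_section` helper are illustrative assumptions, not part of the llms.txt proposal, and the sample file content is invented.

```python
import re

# Matches a Markdown list item that starts with a link: "- [name](url)".
LINK_ITEM = re.compile(r"^\s*[-*]\s+\[[^\]]+\]\([^)\s]+\)")


def links_per_section(llms_txt: str) -> dict[str, int]:
    """Count Markdown link items under each '## ' section."""
    counts: dict[str, int] = {}
    current = None
    for line in llms_txt.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            counts[current] = 0
        elif current is not None and LINK_ITEM.match(line):
            counts[current] += 1
    return counts


# Invented sample: one populated section, one empty section.
sample = """# Example Site

> One-line summary of the site.

## Docs

- [What is GEO](https://example.com/foundations/what-is-geo.md): definition and scope

## Optional
"""

print(links_per_section(sample))  # {'Docs': 1, 'Optional': 0}
```

If every section comes back with a count of zero, the file is headings without an index, which is the failure mode described above.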

Method 3 — Semantic HTML and document structure

LLM retrievers work on the parsed document, not the rendered DOM. Heading elements (<h1> through <h4>), proper <article> and <section> boundaries, real <table> markup, and accessible <figure>/<figcaption> pairs are not cosmetic — they are the structural anchors a retriever uses to slice your page into chunks.

Three rules that matter:

The first heading on every article page is <h1>, used once, with the headline. The model uses it as the canonical answer to "what is this page about". A page where the visible title is rendered as a <div> with a CSS class is invisible to many parsers.

Section headings should be answer-bearing, not navigational. <h2>How citation rate is calculated</h2> is more retrievable than <h2>The math</h2>, because the H2 itself is a quotable answer to a query.

Tables and lists should be real semantic markup, not divs styled to look like tables. The number of times we have seen "tabular" data shipped as nested divs is alarming — and every one of those tables is invisible to a chunker.
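The first two rules can be checked with the standard library alone. This is a minimal sketch using Python's `html.parser`; the `HeadingAudit` class and `audit_headings` function are hypothetical names, and the sample markup is invented. It lists the `<h1>` to `<h4>` elements a parser would see, so a title rendered as a styled `<div>` simply does not show up.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4")


class HeadingAudit(HTMLParser):
    """Collect h1-h4 headings in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []      # list of (tag, text) pairs
        self._open_tag = None   # heading tag currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._open_tag = tag
            self._text = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self.headings.append((tag, "".join(self._text).strip()))
            self._open_tag = None


def audit_headings(html: str) -> list[tuple[str, str]]:
    parser = HeadingAudit()
    parser.feed(html)
    return parser.headings


page = (
    "<article><h1>How citation rate works</h1>"
    "<h2>How citation rate is calculated</h2>"
    "<div class='title'>Not a heading</div></article>"
)
headings = audit_headings(page)
h1_count = sum(1 for tag, _ in headings if tag == "h1")
print(headings)
print("h1 count:", h1_count)
```

An `h1_count` other than 1, or H2 text that reads as navigation rather than an answer, is worth a manual look.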

Method 4 — Markdown-as-source and `.md` aliases

A growing pattern, popularised by Anthropic's documentation, Vercel, and the Stripe API reference, is to serve a raw-Markdown version of every important page at a .md alias — for example, /foundations/what-is-geo and /foundations/what-is-geo.md.

The mechanism is simple: when an LLM or developer wants the canonical, parse-friendly version of the page, they fetch the .md URL and get the source text without HTML chrome, JavaScript, or rendering quirks. The page is identical in content, but cheaper to ingest. Several RAG products (notably those serving developer documentation) preferentially fetch the .md alternate when one is advertised in <link rel="alternate" type="text/markdown" href="...">.

This is a cheap method. If you author in Markdown or MDX, you already have the source — exposing it as an alternate is a route handler. If you do not, the conversion cost is the main bottleneck.
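To confirm that a page actually advertises its Markdown alternate, one option is to look for the `<link rel="alternate" type="text/markdown">` element in the returned HTML. The sketch below assumes the tag is written exactly in that form. `MarkdownAlternateFinder` and `markdown_alternate` are hypothetical names, and the sample URL is a placeholder.

```python
from html.parser import HTMLParser


class MarkdownAlternateFinder(HTMLParser):
    """Record the href of the first <link rel="alternate" type="text/markdown">."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if (
            self.href is None
            and tag == "link"
            and attr.get("rel") == "alternate"
            and attr.get("type") == "text/markdown"
        ):
            self.href = attr.get("href")


def markdown_alternate(html: str):
    """Return the advertised .md URL, or None if the page does not declare one."""
    finder = MarkdownAlternateFinder()
    finder.feed(html)
    return finder.href


head = (
    '<head><link rel="canonical" href="https://example.com/foundations/what-is-geo">'
    '<link rel="alternate" type="text/markdown" '
    'href="https://example.com/foundations/what-is-geo.md"></head>'
)
print(markdown_alternate(head))  # https://example.com/foundations/what-is-geo.md
```

A `None` result means the alias may exist but is not discoverable from the HTML page itself.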

Method 5 — Canonicalization, robots, sitemap, and Open Graph

The least exciting and most underweighted block of technical work. Each of the four pieces is doing a different job:

rel="canonical" collapses near-duplicate URLs into a single citation target. Without it, retrieval can split your authority across query-parameter variants, mobile subdomains, or AMP versions, and the LLM cites the version with the worst extractable content.

robots.txt is your contract with crawlers. The contemporary debate is whether to allow GPTBot, ClaudeBot, PerplexityBot, and Google-Extended (a robots.txt control token rather than a separate crawler). This is the line where you trade discoverability for control of your training-corpus inclusion. The honest answer in May 2026 is that blocking these user agents measurably reduces citation rate; the evidence comes from publisher case studies after the 2024–2025 wave of New York Times-style opt-outs. If your strategy is to be cited, you want them allowed.

sitemap.xml is how you tell the crawler what you have. Generate it programmatically, include every public article, set lastmod accurately, and submit it via Google Search Console and Bing Webmaster. AI-native crawlers increasingly use it as a discovery primitive.

Open Graph (og:title, og:description, og:image) and Twitter Cards (twitter:card, twitter:image) drive social distribution, which drives backlinks, which feeds Family 4. Treat them as part of the technical baseline, even if their direct GEO effect is downstream.
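Whether a given robots.txt actually allows the AI user agents named above can be checked offline with Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: the `ai_agent_access` helper is a hypothetical name, the sample rules are invented, and the standard-library parser may not interpret every directive exactly as each vendor's crawler does.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the robots.txt discussion above.
AI_USER_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def ai_agent_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each AI user agent to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_USER_AGENTS}


# Invented example: GPTBot is blocked, everything else is allowed.
robots = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = ai_agent_access(robots, "https://example.com/foundations/what-is-geo")
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running this against the production file after every robots.txt change is a cheap guard against blocking a crawler by accident.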

Family 2 — Content design

Content methods change the information density of the page itself — the part the generator reads at answer time. They are the highest-leverage family for an established editorial site because they target Stage 3 directly: the model has already retrieved you and is now deciding whether your chunk is good enough to quote.

Method 6 — Answer-first H1 and dek

The H1 and the dek (subtitle) together are the unit the retriever uses to decide whether to surface your page at all. The rule, distilled from a year of looking at what gets cited and what does not:

The H1 should be the question or claim the page answers. Not a clever angle, not a hook, not a brand reference. If a reader asked "how does ChatGPT decide which source to cite?", the H1 should be a recognisable rewrite of that question — "How ChatGPT Decides Which Source to Cite" — not "The Citation Lottery".

The dek (subtitle, 1–2 sentences, ≤280 characters) compresses the article's specific finding into a sentence the model can quote verbatim. A template like "We ran [N] ChatGPT sessions across [M] niches and recorded every citation" is a citable sentence; "In this article, we'll explore how citation works" is filler.

We are testing this method on our own pages — the live protocol and dates are in our experiment log — and will publish the before/after result once the measurement window closes. Until then, treat the answer-first rule as well-supported by the published literature (Aggarwal et al., 2024) rather than by a result of ours.

Method 7 — Chunkability

Retrievers do not read your page; they read a chunk of it. The unit of citation is the chunk, and chunkability is a property of how easily a retriever can carve your page into self-contained, quotable pieces.

Concretely, chunkability improves when:

The first one or two sentences after each H2 stand on their own as an answer to the H2's implicit question. A reader (or model) who lands on just that section understands the point without scrolling up.

Paragraphs are short. Two to four sentences is the editorial sweet spot for screen reading, and it happens to coincide with the chunk-size sweet spot for most RAG systems (which target windows of roughly 150–500 tokens).

Lists, tables, and code blocks are real structured elements, not ASCII art. The retriever knows what a <ul> is; it does not know what to do with em-dashes in a wall of prose.

Pull quotes and standalone-sentence claims are intentional. A sentence that you want to be quoted should be its own paragraph, free of subordinate clauses, and built around a concrete number or named entity. The quotation-addition strategy tested by Aggarwal et al. (2024) operates on exactly this property.
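The paragraph-length rule above lends itself to a quick lint. The sketch below is deliberately crude: it splits sentences on terminal punctuation and uses whitespace-separated words as a stand-in for tokens, which is an assumption and not how any particular RAG system tokenises. `paragraph_stats` is a hypothetical helper, and the four-sentence threshold comes from the guideline above.

```python
import re

# Split after ., ! or ? followed by whitespace. Crude, but enough for a lint.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def paragraph_stats(text: str, max_sentences: int = 4) -> list[dict]:
    """Per-paragraph sentence and word counts, flagging long paragraphs."""
    stats = []
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        sentences = [s for s in SENTENCE_END.split(para) if s]
        stats.append({
            "sentences": len(sentences),
            "words": len(para.split()),  # rough proxy for tokens
            "too_long": len(sentences) > max_sentences,
        })
    return stats


draft = (
    "Citation rate is the share of prompts where an engine cites you. "
    "It is computed per engine.\n\n"
    "One. Two. Three. Four. Five."
)
for row in paragraph_stats(draft):
    print(row)
```

The flag is a prompt for an editor, not a verdict: some long paragraphs are fine, and abbreviations will confuse the sentence splitter.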

Method 8 — Primary research and methodology disclosure

This is the single highest-evidence method on the public record. Aggarwal et al. (2024) tested nine optimisation strategies on a held-out query set, using their own GPT-3.5-based generative engine and a follow-up check on Perplexity.ai. The strategies that moved the paper's visibility metric the most, by up to ~40% on some metrics, were citing sources, adding statistics, and adding quotations. Each of these is, fundamentally, a marker of primary or near-primary research.

The reason the effect is large is that primary research is information you cannot get elsewhere. When a model has to synthesise an answer and your page is the only source carrying a specific number, methodology, or quotation, you are not competing for citation — you are the only candidate.

The implementation cost is high. Primary research means running a test, collecting a dataset, writing a methodology section, and publishing the data alongside the article. The payoff is that one well-executed piece outperforms ten generic explainers on the same topic, and you have a defensible position when the next wave of generic explainers floods the topic.

If full primary research is out of reach for a given article, the next-best step is to write a transparent methodology paragraph anyway — "We compared X, Y, and Z by reading their public documentation and replicating the setup described in each" — because methodology disclosure itself is a citability signal.

Method 9 — Entity coverage and inline definitions

LLMs are entity-aware. The first time you mention "Schema.org", "RAG", "Perplexity", or any other proper noun on the page, you have an opportunity to define it inline in a sentence the model can quote on its own. Schema.org's DefinedTerm type makes the definition explicitly machine-readable.

Two practical heuristics:

A cornerstone article should mention at least 8–12 named entities relevant to its topic — people, papers, tools, concepts — each with a concrete reference or link. Sparse entity coverage correlates with weaker pre-training representation; dense entity coverage signals the page is about the topic in a way that matters for retrieval.

The first inline definition should be a complete sentence: "Retrieval-Augmented Generation (RAG) is the technique of fetching documents at query time and passing them to a language model as context, introduced by Lewis et al. (2020)." Not "RAG (defined below)", not "RAG, which we'll cover later".

Method 10 — Citation density to primary sources

Every important claim on the page should resolve to a primary source: a paper, an official document, a dataset, or an announcement from the organisation responsible for the thing being described. Linking to a paraphrasing blog post is a signal that you did not check the original, and increasingly, retrievers can detect the citation distance from your page to the closest primary source.

The practical version: for a 3,000-word article, expect 10–25 outbound primary-source citations. Each one should be a hyperlink in the body of the text (not a footnote, not a "Sources" appendix), and the anchor text should describe the cited work specifically, not "click here".

The anti-pattern we see most often: a "Sources" or "References" section at the bottom of the article with ten URLs and no inline anchoring. This is the artefact of articles written without checking the sources during writing, and retrievers treat it as such.

Family 3 — Structural metadata and discovery

Structural methods change the relationships between pages — how your site is wired together internally, and how the rest of the web points to it. These methods feed both the retriever (which uses link topology as a relevance signal) and the pre-training corpus (where co-citation patterns are a strong authority cue).

Method 11 — Internal linking topology and pillar architecture

A pillar-and-spoke architecture is the editorial form most retrievers reward. The pillar is a long, definitional article on a broad topic (this one). The spokes are narrower articles that each cover a sub-topic, link back to the pillar, and are linked from the pillar. The result is a small graph where every node is reachable in two or three clicks and every node has a clear topical neighbourhood.

Three concrete rules:

Every spoke article links to its pillar at least once in the body. Anchor text varies — sometimes the pillar's title, sometimes a sub-claim from it — to avoid the appearance of templated linking.

Every pillar article links to at least three of its strongest spokes from the body of the text, not just from a "See also" appendix. The links should be where a reader would naturally pivot to the deeper topic.

Orphan pages — articles with no inbound internal links — are an outright failure mode. They are functionally invisible to most retrievers. Run a periodic audit (we have a pnpm audit:links script in this codebase) to surface orphans and either link them or unpublish them.
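The orphan check reduces to a set difference over the internal link graph. The sketch below illustrates the idea only; it is not the `pnpm audit:links` script mentioned above, whose implementation is not shown here. `find_orphans` is a hypothetical name and the URLs are placeholders.

```python
def find_orphans(links: dict[str, set[str]]) -> set[str]:
    """Return pages that no other page links to.

    `links` maps each page URL to the set of internal URLs it links out to.
    Self-links do not count as inbound links.
    """
    linked_to: set[str] = set()
    for source, targets in links.items():
        linked_to |= {target for target in targets if target != source}
    return set(links) - linked_to


# Placeholder site: one pillar, two spokes, one forgotten draft.
site = {
    "/foundations/geo-methods": {"/foundations/llms-txt", "/foundations/json-ld"},
    "/foundations/llms-txt": {"/foundations/geo-methods"},
    "/foundations/json-ld": {"/foundations/geo-methods"},
    "/foundations/old-draft": {"/foundations/geo-methods"},
}

print(find_orphans(site))  # {'/foundations/old-draft'}
```

In a real audit the homepage and other entry points would need to be excluded by hand, since nothing inside the article graph may link to them.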

Method 12 — External backlinks, co-citation, and the open-web graph

The pre-LLM SEO regime cared about backlinks. The LLM era still does, but the type of backlink that matters has shifted. Co-citation in Wikipedia, news outlets, academic papers, and .edu domains is more valuable than ever, because these are exactly the sources that are over-represented in LLM retrieval and pre-training.

The corollary: a single citation from a Wikipedia article is worth more than a hundred backlinks from low-authority directories. The methods that move this needle are slower than technical fixes — they look like contributing primary research that gets cited by others, publishing data that journalists use, and being the cleanest available source on a specific narrow topic.

The fastest-acting subspecies of this method is unlinked mentions. LLMs frequently associate brands with topics even when there is no hyperlink between the source mentioning the brand and the brand's own domain — and unlinked mentions in trade press, podcast transcripts, and conference talks contribute meaningfully to that association. Tracking mentions, not just backlinks, is an emerging measurement practice.

Method 13 — Cover images and Open Graph cards

The cover image and Open Graph card determine whether your article spreads on social, and social spread is the most reliable producer of the kind of backlinks and unlinked mentions that move Family 4. Treat the OG card as a piece of the article, not an afterthought.

Practical specs: 1200×630 pixels, under 200KB, type and image legible at thumbnail size, text in the image readable in both light and dark mode previews. Test the rendered card in opengraph.xyz or cards-dev.twitter.com before publishing.

Family 4 — Brand and authority signals

Brand methods change the web's opinion of your domain. They are the slowest-acting family and the hardest to fake. The methods here will not pay off in week one — but they are the difference between being cited occasionally for narrow technical queries and being cited routinely as a canonical source on a topic.

Method 14 — Verified author identity and Schema.org `Person`

Every article on an editorial site should have a named author with a real biography, a real public identity, and Schema.org Person markup that links the byline to the same identity across the web — LinkedIn, ORCID for academic authors, GitHub for technical authors, a professional website. The marker is sameAs, a list of URLs the author is known by elsewhere.

LLMs use authorship as an authority signal at multiple stages. At retrieval time, named-author articles outperform anonymous ones in domains where expertise matters (medical, legal, financial, technical). At generation time, the model is more willing to quote a source it can attribute to a specific named expert. At pre-training time, the entity-resolution pipeline links the author's articles across the web, building a stronger signal than any single piece of content could.

The anti-pattern: a generic "Staff" or "Editorial Team" byline. We have not yet seen a published study quantifying the penalty, but the directional evidence from search-era E-E-A-T research carries forward.
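As an illustration of the `sameAs` pattern described in this section, the sketch below builds a `Person` block and serialises it to JSON-LD. The `person_jsonld` helper is a hypothetical name, and the author name and every URL are placeholders, not real profiles.

```python
import json


def person_jsonld(name: str, url: str, same_as: list[str]) -> dict:
    """Build a Schema.org Person block linking a byline to external profiles."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


# Placeholder identity: none of these URLs point at a real person.
author = person_jsonld(
    name="Jane Doe",
    url="https://example.com/authors/jane-doe",
    same_as=[
        "https://www.linkedin.com/in/jane-doe-example",
        "https://github.com/jane-doe-example",
    ],
)

print(json.dumps(author, indent=2))
```

The same dictionary can be nested as the `author` value of an Article block, so the byline and the identity stay in one place.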

Method 15 — E-E-A-T in the LLM era

Google's E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) framework was a search-era construct, but the underlying signals — that the author has lived experience, demonstrated expertise, recognition by peers, and a trustworthy publishing history — are exactly the ones an LLM picks up on as well. The mechanism is different (entity resolution and corpus statistics, rather than a ranking model), but the signals are the same.

In practice, E-E-A-T translates to four things on an editorial site:

A "How we tested" or "Methodology" block on every research article, disclosing how the data was collected and what was not tested. This is exactly the editorial muscle that distinguishes a citable source from an opinion piece.

Visible author credentials — title, organisation, prior work — on every byline. The author's name, alone, is weaker than the name plus a recognisable affiliation.

A publication date and a last-reviewed date on every article. Articles without dates are aggressively down-weighted by retrievers, because the model cannot tell whether the information is current.

A correction and update policy that is visible and used. Errata are a strength signal, not a weakness signal — they tell the model that the source is being maintained.

Method 16 — Mention-building as a discipline

Mention-building is the active practice of getting your name, your work, and your data into the documents that LLMs over-represent: Wikipedia (carefully, following the project's conflict-of-interest rules), trade press, podcast transcripts, conference talks, GitHub READMEs, and the long tail of niche-authoritative blogs.

The work is unglamorous and slow. It looks like contributing data to a public dataset that journalists then cite, writing guest essays for trade publications, being a useful source for a journalist on a deadline, sponsoring a niche newsletter not for the link but for the editorial nod, and consistently showing up in the comment sections and forums where the topic is discussed. The compounding effect over twelve to twenty-four months is significant — and is the single most-cited factor in the case studies of brands that "appear out of nowhere" in LLM citations.

Family 5 — Measurement and iteration

Measurement is the family most practitioners skip — and the one without which every other family is faith-based. You cannot tell whether a method is working without a measurement baseline, and you cannot pick the next method to invest in without a sense of where the current gaps are.

Method 17 — Citation rate as the primary KPI

Citation rate is the share of relevant prompts in a defined battery where a given LLM cites your domain at least once. It is the single metric closest to the outcome you actually care about — being part of the answer to a query in your topical neighbourhood.

The formula:

Citation rate = (number of prompts in the battery where the engine cited your domain in the answer) ÷ (total number of prompts in the battery)

A battery is a set of prompts you have committed to monitoring on a recurring basis. For a typical editorial site, the right size is 50–200 prompts, covering the queries you would want to be cited for. Run the battery on a schedule (weekly is sufficient for most domains) and log the result per engine.

The trap to avoid: vanity prompts. If your battery consists only of prompts you already win, the metric is meaningless. Include 30–40% prompts where you currently lose, so the metric has room to move.
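The citation rate formula translates directly into code. This is a minimal sketch that assumes each prompt's result has already been reduced to the set of domains the engine cited. The `citation_rate` function name and the sample data are illustrative.

```python
def citation_rate(cited_domains_per_prompt: list[set[str]], domain: str) -> float:
    """Share of prompts in the battery where `domain` was cited at least once."""
    if not cited_domains_per_prompt:
        raise ValueError("battery is empty")
    hits = sum(1 for cited in cited_domains_per_prompt if domain in cited)
    return hits / len(cited_domains_per_prompt)


# Invented results for a four-prompt battery on one engine.
results = [
    {"example.com", "wikipedia.org"},
    {"wikipedia.org"},
    set(),                      # the engine answered without citations
    {"other.example.org"},
]

print(citation_rate(results, "example.com"))  # 0.25
```

Prompts that return no citations still count in the denominator, which keeps the metric comparable from one run to the next.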

Method 18 — Share-of-Voice (SoV)

Share-of-Voice (SoV) is the share of cited sources, across a battery, attributable to your domain. Where citation rate asks "do they cite us at all?", SoV asks "of all the sources cited in this neighbourhood, what fraction is us?".

The formula:

Share of Voice = (number of citations to your domain across the battery) ÷ (total number of citations across the battery)

SoV is harder to move than citation rate, and the headline number is usually small (1–8% even for category-leading editorial sites). What matters is the trend, not the absolute value. We covered the measurement playbook in detail in Share of Voice — definition and method.
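Share of Voice can be computed the same way, as a sketch. It assumes the battery's output has been flattened into one entry per citation, with repeats kept. `share_of_voice` is an illustrative name and the data is invented.

```python
def share_of_voice(citations: list[str], domain: str) -> float:
    """Fraction of all citations across the battery that point at `domain`.

    `citations` holds one domain per cited source, repeats included.
    """
    if not citations:
        raise ValueError("no citations recorded")
    return citations.count(domain) / len(citations)


# Invented flat list of every citation seen across a battery run.
all_citations = [
    "wikipedia.org", "wikipedia.org", "wikipedia.org",
    "other.example.org", "other.example.org",
    "news.example.net", "news.example.net", "news.example.net",
    "docs.example.io",
    "example.com",
]

print(share_of_voice(all_citations, "example.com"))  # 0.1
```

Unlike citation rate, the denominator here moves with how many sources the engines cite, so a change in engine behaviour can shift SoV even when your own citations are flat.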

Method 19 — Cross-engine prompt batteries

A method that is doing well in ChatGPT may be doing poorly in Perplexity, AI Overviews, Claude, or Copilot, because each engine weights signals differently. The discipline is to run the same battery on every engine you care about and to track the divergence, not just the average.

Practically, this looks like:

A single CSV of prompts, versioned in the repo. Every entry has a prompt text, an intent label, and an expected-domain shortlist (the small set of domains it would be reasonable to cite for that prompt).

A weekly run that, for each prompt, queries every engine, parses the citation set, and writes a row per (prompt, engine, timestamp, cited_url, cited_position). Today this is most efficiently done with a combination of headless browser automation for engines without an API and the official APIs where they exist.

A small dashboard or notebook that, for each engine, shows citation rate and SoV over time, and lets you slice by intent and by content cluster.
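The row format and the per-engine slice described above might look like the following. This is a sketch, not our production pipeline: `CitationRow`, `rate_by_engine`, the engine labels, and all sample data are assumptions made for illustration, and collecting the rows (browser automation or APIs) is out of scope here.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CitationRow:
    """One row per (prompt, engine, timestamp, cited_url, cited_position)."""
    prompt: str
    engine: str
    timestamp: str
    cited_url: str
    cited_position: int


def rate_by_engine(rows: list[CitationRow], battery: list[str],
                   engines: list[str], domain: str) -> dict[str, float]:
    """Citation rate for `domain` on each engine, over the full battery."""
    hits: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        if urlparse(row.cited_url).hostname == domain:
            hits[row.engine].add(row.prompt)
    # Engines are passed in explicitly so an engine that cited nothing
    # still appears in the result with a rate of 0.0.
    return {engine: len(hits[engine]) / len(battery) for engine in engines}


battery = ["what is geo", "how to measure citation rate"]
ts = "2026-06-08T09:00:00Z"
rows = [
    CitationRow(battery[0], "chatgpt", ts, "https://example.com/foundations/what-is-geo", 1),
    CitationRow(battery[1], "chatgpt", ts, "https://example.com/measurement/citation-rate", 2),
    CitationRow(battery[0], "perplexity", ts, "https://other.example.org/geo", 1),
    CitationRow(battery[1], "perplexity", ts, "https://example.com/measurement/citation-rate", 3),
]

print(rate_by_engine(rows, battery, ["chatgpt", "perplexity", "claude"], "example.com"))
# {'chatgpt': 1.0, 'perplexity': 0.5, 'claude': 0.0}
```

The gap between engines in the output is the divergence worth tracking. Averaging the three numbers would hide it.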

Method 20 — Tools landscape (May 2026 snapshot)

The market for citation-tracking tools has matured. As of May 2026, the major options are:

Profound — enterprise, large prompt batteries, brand-tracking focus.

Peec AI — share-of-voice oriented, European, fast iteration.

Otterly.ai — affordable, prompt-monitoring focused, good for a single-brand setup.

Athena HQ — enterprise, AI-search-monitoring with workflow integrations.

Goodie — emerging, dashboards over agentic-search results.

A hands-on benchmark is in progress; until it ships, treat this list as a market map, not a ranking — it is alphabetical within category and reflects market presence as of May 2026, not endorsement. Either way, every tool's signal-to-noise depends on how well-defined your prompt battery is. Buying a tool without a battery is a recipe for dashboards no one trusts.

Method 21 — Manual prompt-testing protocol

You do not need a paid tool to start measuring. A reproducible manual protocol, run once a week for an hour, is enough to establish a baseline.

The minimum-viable protocol:

A fixed list of 50 prompts in a spreadsheet, versioned in your repo.

A fresh, signed-out browser session for each engine (ChatGPT, Claude, Perplexity, AI Overviews via Google Search).

For each prompt-engine pair: paste the prompt, wait for the answer, copy the cited URLs, paste them back into the spreadsheet under a date-stamped column.

A simple pivot at the end: citation rate per engine, citation rate per intent cluster, SoV.

The manual protocol is fragile (engines change UI, sessions personalise) but it is honest. It is the right starting point for the first 30–60 days while you decide whether to invest in tooling.

Method 22 — Pre/post measurement around deploys

The most underused application of citation tracking is attribution. When you ship a content change — a rewrite, a new schema block, a new internal-link cluster — measure the change on the same battery in a two-week window before and after the deploy. Most methods either move the needle or do not, and you find out which is which by running the battery, not by reading a blog post.

The instrumentation has to be ready before you ship. Pre-deploy battery, deploy, post-deploy battery, and a written note in the article's frontmatter (lastReviewedAt, updatedAt) that anchors the timeline.
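A pre/post comparison on a fixed battery can be reduced to which prompts were gained and which were lost. The sketch below assumes each run has been summarised as the set of prompts where your domain was cited. `pre_post_change` and the sample prompts are hypothetical, and the sketch makes no claim about statistical significance.

```python
def pre_post_change(battery: list[str], cited_before: set[str],
                    cited_after: set[str]) -> dict:
    """Compare two runs of the same battery around a deploy."""
    if not battery:
        raise ValueError("battery is empty")
    prompts = set(battery)
    before = cited_before & prompts   # ignore prompts not in the battery
    after = cited_after & prompts
    return {
        "rate_before": len(before) / len(battery),
        "rate_after": len(after) / len(battery),
        "gained": sorted(after - before),
        "lost": sorted(before - after),
    }


battery = ["what is geo", "what is llms.txt", "how to measure citation rate", "geo vs seo"]
change = pre_post_change(
    battery,
    cited_before={"what is geo"},
    cited_after={"what is geo", "what is llms.txt"},
)
print(change)
```

Listing gained and lost prompts by name, rather than only the two rates, makes it easier to tie a change back to the specific page that was edited.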

The evidence matrix

Not every method has equal evidence. Here is our current read, as of 2026-06-08, drawn from the published literature, our own previously published audits, and the documented case studies from other practitioners. We will update the matrix as we publish more primary research.

Method	Family	Evidence quality	Implementation cost	Time to effect
1. Schema.org JSON-LD	Technical	Medium (indirect)	Low	Days
2. llms.txt / llms-full.txt	Technical	Low (emerging)	Low	Unknown
3. Semantic HTML	Technical	High	Low–medium	Days
4. Markdown alternates	Technical	Medium	Low	Days
5. Canonical / robots / sitemap / OG	Technical	High	Low	Days
6. Answer-first H1 + dek	Content	High (Aggarwal et al. 2024)	Low	Weeks
7. Chunkability	Content	High	Medium	Weeks
8. Primary research + methodology	Content	Very high (Aggarwal et al. 2024)	High	Weeks–months
9. Entity coverage + definitions	Content	Medium	Low	Weeks
10. Citation density to primary sources	Content	High (Aggarwal et al. 2024)	Medium	Weeks
11. Internal linking topology	Structural	Medium	Medium	Weeks
12. External backlinks + co-citation	Structural	High	Very high	Months
13. Cover images / OG cards	Structural	Indirect (via social)	Low	Days–weeks
14. Verified author identity	Authority	Medium (carry-over from SEO)	Medium	Weeks–months
15. E-E-A-T in the LLM era	Authority	Medium	High	Months
16. Mention-building	Authority	High (case studies)	Very high	Months
17. Citation rate KPI	Measurement	n/a (process)	Low	n/a
18. Share-of-Voice KPI	Measurement	n/a (process)	Low	n/a
19. Cross-engine batteries	Measurement	n/a (process)	Medium	n/a
20. Citation-tracking tools	Measurement	n/a (process)	Medium ($)	n/a
21. Manual protocol	Measurement	n/a (process)	Low	n/a
22. Pre/post deploy measurement	Measurement	n/a (process)	Medium	n/a
23. Methodology disclosure (standalone)	Content	Medium	Low	Weeks

Evidence quality is graded very high when there is at least one peer-reviewed paper or replicated public test showing the effect; high when there are multiple practitioner case studies pointing the same direction; medium when the direction is plausible from related disciplines but not directly tested; low when adoption is recent and outcomes are unknown. The n/a (process) entries mark measurement methods, which are process steps rather than interventions with a testable effect on citations.
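The grading rule can be written as a small decision function, which makes it easier to apply consistently when a new method is added to the matrix. This is a sketch only: the parameter names are hypothetical, and reading "multiple" as two or more case studies is an assumption.

```python
def grade_evidence(peer_reviewed_or_replicated, case_studies, plausible_from_related):
    """Map the grading criteria to one of the four evidence labels."""
    if peer_reviewed_or_replicated:
        # At least one peer-reviewed paper or replicated public test.
        return "very high"
    if case_studies >= 2:
        # Assumption: "multiple" practitioner case studies means two or more.
        return "high"
    if plausible_from_related:
        # Direction is plausible from related disciplines, not directly tested.
        return "medium"
    # Recent adoption, outcomes unknown.
    return "low"


print(grade_evidence(False, 3, False))
```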

Common antipatterns

Methods that sound like they should work, but do not — or that fire backward.

The first antipattern is schema-stuffing: declaring FAQPage, HowTo, or Review schema on pages that do not contain the corresponding visible content. Google has been explicit since 2023 that this is a manual-action risk for traditional search; the LLM era has not changed the calculus. Treat schema as documentation of what is on the page, not as a wishlist.
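A check for this antipattern can be automated. The sketch below, using only the Python standard library, flags FAQPage questions declared in JSON-LD that do not appear in the page's visible text. It is a minimal illustration under stated assumptions: it handles a top-level object or list (not @graph), expects mainEntity to be a list, does not detect text hidden with CSS, and the sample page is invented.

```python
import json
from html.parser import HTMLParser


class PageScan(HTMLParser):
    """Collects JSON-LD payloads and visible text from one HTML document."""

    def __init__(self):
        super().__init__()
        self.jsonld = []
        self.text = []
        self._mode = "text"
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            is_jsonld = (tag == "script"
                         and dict(attrs).get("type") == "application/ld+json")
            self._mode = "jsonld" if is_jsonld else "skip"
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._mode == "jsonld":
                self.jsonld.append("".join(self._buffer))
            self._mode = "text"

    def handle_data(self, data):
        if self._mode == "jsonld":
            self._buffer.append(data)
        elif self._mode == "text":
            self.text.append(data)


def normalise(value):
    """Lower-case and collapse whitespace for a tolerant comparison."""
    return " ".join(value.split()).lower()


def undeclared_faq_questions(html):
    """Return FAQPage question names that are missing from visible text."""
    scan = PageScan()
    scan.feed(html)
    visible = normalise(" ".join(scan.text))
    missing = []
    for payload in scan.jsonld:
        doc = json.loads(payload)
        for node in doc if isinstance(doc, list) else [doc]:
            if node.get("@type") != "FAQPage":
                continue
            for question in node.get("mainEntity", []):
                name = question.get("name", "")
                if normalise(name) not in visible:
                    missing.append(name)
    return missing


SAMPLE = """<html><body>
<h2>What is GEO?</h2><p>Generative Engine Optimization.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [
  {"@type": "Question", "name": "What is GEO?"},
  {"@type": "Question", "name": "Is GEO free?"}
]}
</script>
</body></html>"""

print(undeclared_faq_questions(SAMPLE))
```

An empty result does not prove the markup is valid; it only shows that every declared question has matching text on the page.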

The second is AI-generated content without disclosure or an editing pass. The 2024–2025 wave of mass-generated content has trained retrievers to recognise the surface patterns: uniform paragraph length, predictable phrase structure, low entity density, no methodology section. The penalty is not the AI involvement itself; it is the absence of the editorial pass that would make the piece worth citing.

The third is keyword stuffing rewritten as "LLM optimisation". Repeating "generative engine optimization" twenty times in an article does not make it more citable. What makes it citable is information density — a measurable claim, a number, a methodology, a quote — that LLMs cannot get from the other twenty pages that repeated the same phrase.

The fourth is cloaking — serving different content to crawlers than to humans. This was always against the rules in traditional SEO; in the LLM era the consequences are sharper, because the content that gets ingested into the training corpus is the content the crawler saw, which is then visible at generation time. A user catching the discrepancy is a brand-trust event you cannot recover from quickly.

The fifth is over-optimisation that collapses readability. Paragraphs broken into single-sentence units, headings every fifty words, lists where prose would be clearer — these are all "chunkability" rules taken past the point of usefulness. The model rewards extractable chunks; it does not reward shredded text.

The sixth, and the one we see most often on otherwise-strong sites, is shipping technical changes without measuring. Schema added but not validated. llms.txt shipped but with no link list. Canonical tags added but not audited. Without measurement, the technical work is faith-based, and faith-based work tends to accumulate as cruft rather than as compounding wins.
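The llms.txt case is the easiest of these to check mechanically. The sketch below assumes the layout described in the llms.txt proposal (an H1 title followed by sections of Markdown link lists) and simply counts the list links in a file. The regular expression is a simplification, and the sample file contents and URLs are hypothetical.

```python
import re

# A Markdown list item that starts with a link: "- [Title](https://...)".
LINK_ITEM = re.compile(r"^\s*[-*]\s*\[[^\]]+\]\(([^)\s]+)\)")


def audit_llms_txt(text):
    """Report whether the file has an H1 title and how many list links it has."""
    lines = text.splitlines()
    links = []
    for line in lines:
        match = LINK_ITEM.match(line)
        if match:
            links.append(match.group(1))
    return {
        "has_title": any(line.startswith("# ") for line in lines),
        "link_count": len(links),
        "links": links,
    }


SAMPLE = """# Example Site

> Short summary of what the site covers.

## Guides

- [What is GEO](https://example.com/what-is-geo.md): introduction
- [Methods taxonomy](https://example.com/geo-methods-taxonomy.md)
"""

report = audit_llms_txt(SAMPLE)
print(report["has_title"], report["link_count"])
```

A file that reports a link_count of zero is the "shipped but with no link list" case described above.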

A 30-day implementation roadmap

If you start tomorrow, the order that produces the most movement per hour of effort is technical first, content second, measurement in parallel from day one, authority over the long haul.

Week 1 — Technical baseline. Audit and fix Schema.org markup on the top 20 pages by traffic (Methods 1, 3, 5). Ship llms.txt if you do not already have one (Method 2). Validate everything in validator.schema.org and Google's Rich Results Test. Confirm canonical tags, robots.txt, and sitemap.xml are clean. Add .md alternates if you author in Markdown (Method 4). Add or update Open Graph cards on the top 20 pages (Method 13). Expect this to take three to five days of focused work.

Week 2 — Content rewrites. Pick the top five pages by traffic or strategic importance. Rewrite each one for answer-first H1 and dek (Method 6), chunkability (Method 7), entity coverage (Method 9), and inline citation density (Method 10). If the page is a research piece, add a "How we tested" or methodology block (Methods 8 and 23). The mechanical work is fast; the editorial discipline is the bottleneck.

Week 3 — Authority and topology. Audit internal linking topology and resolve every orphan article (Method 11). Add or improve author pages with verified identity and sameAs links (Method 14). Identify five primary-source citations the top pages currently route through paraphrases, and replace them with the originals. Spend the rest of the week on Method 16 — pick two outlets in your topical neighbourhood and figure out what useful primary research, data, or commentary you could send them in the next quarter.

Week 4 — Measurement baseline. Define your prompt battery (50 prompts to start — Method 19). Run the manual protocol against ChatGPT, Claude, Perplexity, and AI Overviews (Method 21). Compute citation rate and Share of Voice per engine (Methods 17–18). Pick one paid tool to evaluate in week 5 (Method 20). Write down where you are; everything you do from week 5 on should be measurable against this baseline.

Beyond day 30, the rhythm is: every shipped change goes through pre/post measurement on the same battery (Method 22). Every quarter, the battery itself gets reviewed and expanded. Every year, the technical baseline gets re-audited, because the schema vocabulary, the crawler list, and the engines themselves will all have moved.
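The orphan audit in week 3 (Method 11) reduces to counting inbound internal links. The sketch below assumes you already have a mapping from each page to the internal pages it links to, for example from a crawl export. The mapping format and the unlinked-draft path are hypothetical, and the home page is treated as a root that needs no inbound link.

```python
def find_orphans(links, roots=("/",)):
    """Return pages that no other page in the mapping links to."""
    inbound = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            # Ignore self-links and links to pages outside the mapping.
            if target != source and target in inbound:
                inbound[target] += 1
    return sorted(
        page for page, count in inbound.items()
        if count == 0 and page not in roots
    )


# Hypothetical internal-link map: page -> pages it links to.
SITE_LINKS = {
    "/": ["/foundations/what-is-geo"],
    "/foundations/what-is-geo": ["/foundations/geo-methods-taxonomy", "/"],
    "/foundations/geo-methods-taxonomy": ["/foundations/what-is-geo"],
    "/foundations/unlinked-draft": ["/"],
}

print(find_orphans(SITE_LINKS))
```

Pages reachable only through the sitemap still show up as orphans here, which is the intended behaviour for an internal-linking audit.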

FAQ

What is the difference between GEO, AEO, LLMO, and SGE?

GEO (Generative Engine Optimization) is the term most commonly used in the academic literature and was introduced by Aggarwal et al. (2024). AEO (Answer Engine Optimization) is the older, broader term, dating to roughly 2014–2015 and originally focused on featured snippets and voice assistants. LLMO (Large Language Model Optimization) is a near-synonym for GEO with slightly more emphasis on no-browsing scenarios. SGE (Search Generative Experience) was Google's product name for AI Overviews before the May 2024 rebrand — it is a product, not a discipline. We use GEO on this site for the reasons discussed in GEO vs AEO vs LLMO vs SGE: an honest taxonomy.

Do LLMs actually read my Schema.org markup?

Indirectly, yes. The major LLM-powered surfaces (ChatGPT Browse, Perplexity, Claude with web search, AI Overviews, Copilot) all rely on crawlers and parsers that consume Schema.org. The direct effect on citation rate has not been isolated in published research, but valid markup raises the probability your page is correctly parsed, classified, and surfaced.

Does llms.txt actually do anything if no major vendor has confirmed they read it?

The cost of shipping a well-formed llms.txt is roughly ten minutes; the worst case is that no agent reads it. The best case is that you are correctly indexed by the growing long tail of RAG-as-a-service products that have adopted the spec. It is a positive expected-value bet at low cost. The actual evidence on adoption is still emerging; see our llms.txt audit for the 100-domain snapshot.

How long does it take to be cited by ChatGPT after publishing a new article?

For ChatGPT with browsing enabled, the citation can be picked up within hours once the page is indexed. For the no-browsing case (the model citing from its training corpus), the lag is the gap between publishing and the next pre-training snapshot — generally months. Most of the citation lift practitioners can drive in a short horizon comes through the browsing pathway.

Can I pay to be cited by Perplexity, ChatGPT, or AI Overviews?

As of May 2026, there is no paid placement product in mainstream LLM citations. Perplexity has experimented with sponsored questions (clearly labelled), and Google's AI Overviews can include results from paid placements elsewhere on the SERP, but the citation slots themselves are organic. The economics will likely change over the next two years; the recommendation today is to invest in organic citation.

Is AI Overviews citation the same as ChatGPT citation?

No. AI Overviews inherits much of Google's ranking and quality signals (link authority, E-E-A-T, structured data). ChatGPT citations come through a Bing-powered retrieval layer with different weighting. Perplexity uses its own retrieval and ranking. The same article will often be cited by one of them and not the others, which is why cross-engine measurement matters.

What is the single highest-leverage GEO method?

Based on the published evidence (Aggarwal et al., 2024) and the case studies we trust, the highest-leverage method is primary research with methodology disclosure (Method 8). It is also the most expensive. For practitioners without the capacity for original research, the highest-leverage cheap method is answer-first chunking (Methods 6 and 7) — a half-day editorial pass on the top five pages.

How do I measure GEO progress if I cannot see search-engine data?

You measure on the surface where the user is — the LLM answer itself. Define a battery of prompts, run them on a schedule against each engine you care about, log the citations, and compute citation rate and Share of Voice (Methods 17–22). You do not need engine-side data to measure citation, because citation is observable on the rendered answer.

What we do not know yet

Areas where the public evidence is thinner than we would like, and where we plan to run primary research in the next two quarters.

The marginal effect of llms.txt adoption on citation rate, isolated from other variables. We have an adoption audit but not yet a controlled before/after on citation rate.

The relative weight of unlinked mentions versus linked backlinks in modern LLM corpora. The trade press treats the two as different goods; the evidence is mostly anecdotal.

The half-life of a citation lift after a content rewrite. Anecdotally, the gain is durable; we have not seen a published study tracking it over 90+ days.

The interaction effect between multiple methods. Aggarwal et al. (2024) tested strategies in isolation. We suspect (but have not shown) that the methods combine non-additively, with strong content design amplifying the effect of strong technical markup.

The behaviour of citation pipelines on multilingual content. Almost all published GEO testing is English-only. We will run a Polish/German/French replication later in 2026.

If you have data on any of these, email us — we are actively looking for collaborators on primary research.

How we wrote this

This guide is a synthesis, not a primary-research piece. We assembled the taxonomy by reading the published GEO literature (Aggarwal et al. 2024 is the anchor reference), the official documentation from the major LLM vendors (OpenAI, Anthropic, Google, Microsoft, Perplexity), the llms.txt proposal and the wider Answer.AI ecosystem, the Schema.org standard documents, and the body of practitioner case studies published between 2024 and 2026 in trade outlets. We did not run new citation tests for this article; the empirical work referenced is either prior published work or our own previously-published audits, each linked inline.

We disclose two operational details. First, this site has no editorial relationship, paid or unpaid, with any GEO tool vendor; the tool list in Method 20 is alphabetical within categories and reflects market presence as of May 2026, not endorsement. Second, the framework was assembled in-house and is presented as one informed reading of the field, not as consensus. We welcome corrections: every section ends with a permalink, and you can email us with disagreement.

The next planned update of this taxonomy is August 2026, after we run a cross-engine citation battery against the 23 methods on a controlled article set. This version was first published 2026-05-19 and last reviewed 2026-06-08.

Internal QA

Pre-publish checklist (GEO Playbook v0.1) — completed for go-live on 2026-06-08:

A1–A5 pre-research filled
B1–B5 research / data — synthesis-only, no new dataset (disclosed in "How we wrote this")
C1–C10 writing / structure
D1–D8 citability optimization
E1–E10 technical SEO — Article + BreadcrumbList JSON-LD, OG, canonical, and the .md alias are emitted automatically by the scaffold (Velite + lib/seo.ts). The visible FAQ section also emits FAQPage JSON-LD once the FAQPage builder lands.
F1–F5 internal linking — inbound link from /foundations/what-is-geo; outbound links resolve to the live corpus only (no draft targets).
G1–G5 distribution prep — parked (distribution deferred this batch).
H1–H8 pre-publish QA — pnpm verify:geo passes at the cornerstone 100% threshold; JSON-LD validates; dark/light + mobile inherited from the scaffold layout.
I1–I7 post-publish — re-review when the gated studies (answer-first, schema A/B, tools benchmark) ship and the forward-references can become real links.

State promoted draft → live on 2026-06-08 after the honesty + link-integrity pass.

Cite this article

Reference this work in one of the formats below. The same strings are embedded in this page's Schema.org JSON-LD so LLM crawlers see them too.

GeoSalience (2026, May 19). How to Get Cited by LLMs: The Complete Taxonomy of GEO Methods. GeoSalience. https://geosalience.com/foundations/geo-methods-taxonomy

Changelog

Published — 19 May 2026
Updated — 8 June 2026
Last reviewed — 8 June 2026

GeoSalience

Editorial

Independent publication on Generative Engine Optimization. Primary research on how AI search engines retrieve, rank, and cite.

Related

GEO vs AEO vs LLMO vs SGE: An Honest Taxonomy

What is Generative Engine Optimization (GEO)?

Knowledge Cutoff, Web Access, and Why It Matters